Seminario Eloisa Vargiu, 06-09-2012
![Page 1: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/1.jpg)
Eloisa Vargiu (evargiu@bdigital.org) – Cagliari, 6 September 2012
![Page 2: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/2.jpg)
OUTLINE OF THE TALK
Introduction
Online Advertising
A Modern Contextual Advertising System
Syntactic Textual Analysis
Semantic Textual Analysis
Matching
An Example: ConCA
Experimental Results
Conclusions
References
![Page 3: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/3.jpg)
INTRODUCTION
![Page 5: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/5.jpg)
OUTERNET & INTERNET
In Atkinson’s view something is missing…
![Page 10: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/10.jpg)
ONLINE ADVERTISING
![Page 13: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/13.jpg)
ONLINE ADVERTISING
Banner Advertising
![Page 14: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/14.jpg)
ONLINE ADVERTISING
Contextual Advertising
![Page 16: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/16.jpg)
ONLINE ADVERTISING
Is it always a good thing?
![Page 18: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/18.jpg)
A MODERN CONTEXTUAL ADVERTISING SYSTEM
![Page 19: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/19.jpg)
A MODERN CONTEXTUAL ADVERTISING SYSTEM
![Page 20: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/20.jpg)
SYNTACTIC TEXTUAL ANALYSIS
Text Summarization
Bag of Words Representation
![Page 21: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/21.jpg)
SYNTACTIC TEXTUAL ANALYSIS
Text summarization
State of the art techniques
First and Last Paragraph (FLP)
Title, First and Last Paragraph (TFLP)
Snippet (S)
Title and Snippet (TS)
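The four techniques listed above can be sketched in a few lines. This is only an illustration: it assumes paragraphs are separated by blank lines and that the snippet is supplied by the caller (for example, as returned by a search engine). The function names are hypothetical and do not come from the talk.

```python
def split_paragraphs(text: str) -> list[str]:
    """Split on blank lines and drop empty paragraphs (assumed page layout)."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def flp(body: str) -> str:
    """First and Last Paragraph (FLP)."""
    paragraphs = split_paragraphs(body)
    if not paragraphs:
        return ""
    if len(paragraphs) == 1:
        return paragraphs[0]
    return paragraphs[0] + "\n\n" + paragraphs[-1]


def tflp(title: str, body: str) -> str:
    """Title, First and Last Paragraph (TFLP)."""
    return (title.strip() + "\n\n" + flp(body)).strip()


def s(snippet: str) -> str:
    """Snippet (S): the snippet itself, obtained elsewhere."""
    return snippet.strip()


def ts(title: str, snippet: str) -> str:
    """Title and Snippet (TS)."""
    return (title.strip() + "\n\n" + s(snippet)).strip()
```

Each function returns a shorter text that replaces the full page in the later analysis steps.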
![Page 22: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/22.jpg)
SYNTACTIC TEXTUAL ANALYSIS
First and Last Paragraph (FLP)
You don’t need to shell out thousands,
survive various ballots, or swap a family
member for a ticket to enjoy the 2012
Summer Olympic Games this year. There's
all manner of free events and associated
shenanigans taking place in London and
across the UK to mark the occasion. Here
are ten ways to join in without spending any
money.
http://www.roughguides.com/website/Travel/SpotLight/ViewSpotLight.aspx?spotLightID=575
Indulge in a family feast
Volunteer chefs at 24 Sure Start Centres
across the UK are preparing to dish up free
delights throughout the period. Details,
along with all the other events that make up
the Cultural Olympiad, are available on the
site.
![Page 24: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/24.jpg)
SYNTACTIC TEXTUAL ANALYSIS
Title, First and Last Paragraph (TFLP)
http://www.roughguides.com/website/Travel/SpotLight/ViewSpotLight.aspx?spotLightID=575
London 2012 – Ten ways to celebrate the Olympics for free
You don’t need to shell out thousands,
survive various ballots, or swap a family
member for a ticket to enjoy the 2012
Summer Olympic Games this year. There's
all manner of free events and associated
shenanigans taking place in London and
across the UK to mark the occasion. Here
are ten ways to join in without spending any
money.
Indulge in a family feast
Volunteer chefs at 24 Sure Start Centres
across the UK are preparing to dish up free
delights throughout the period. Details,
along with all the other events that make up
the Cultural Olympiad, are available on the
site.
![Page 25: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/25.jpg)
SYNTACTIC TEXTUAL ANALYSIS
Snippet (S)
http://www.roughguides.com/website/Travel/SpotLight/ViewSpotLight.aspx?spotLightID=575
![Page 26: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/26.jpg)
SYNTACTIC TEXTUAL ANALYSIS
Title and Snippet (TS)
http://www.roughguides.com/website/Travel/SpotLight/ViewSpotLight.aspx?spotLightID=575
![Page 27: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/27.jpg)
SYNTACTIC TEXTUAL ANALYSIS
Bag of Words (BoW) representation
Dimensionality reduction
Stop-words removal
Stemming
Vector representation
Set of pairs <word, occurrences>
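A minimal sketch of the BoW pipeline outlined above: stop-words removal, stemming, then counting occurrences. The stop-word list is a tiny illustrative subset and `crude_stem` is a deliberately naive suffix stripper standing in for a real stemmer (such as Porter's), so its output will not match a real stemmer on every word. Both are assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "all", "and", "any", "are", "for", "here", "in", "of",
              "or", "out", "the", "there", "this", "to", "you", "without"}


def crude_stem(word: str) -> str:
    """Strip a trailing 'ing' or plural 's'; a naive stand-in for a real stemmer."""
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("s") and not word.endswith(("ss", "us")) and len(word) > 3:
        return word[:-1]
    return word


def bag_of_words(text: str) -> Counter:
    """Return the set of <word, occurrences> pairs as a Counter."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    return Counter(crude_stem(t) for t in kept)
```

For instance, `bag_of_words("Free events and free games in London")` counts `free` twice and `event`, `game` and `london` once each.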
![Page 30: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/30.jpg)
SYNTACTIC TEXTUAL ANALYSIS
Stop-words removal
You don’t need to shell out thousands,
survive various ballots, or swap a
family member for a ticket to enjoy the
2012 Summer Olympic Games this
year. There's all manner of free events
and associated shenanigans taking
place in London and across the UK to
mark the occasion. Here are ten ways
to join in without spending any money.
After stop-words removal:
Shell thousands, survive various
ballots, swap family member ticket
enjoy 2012 Summer Olympic Games
year. Manner free events associated
shenanigans taking place London
across UK mark occasion. ten ways
join spending money.
![Page 33: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/33.jpg)
SYNTACTIC TEXTUAL ANALYSIS
Stemming
Shell thousands, survive various
ballots, swap family member ticket
enjoy 2012 Summer Olympic Games
year. Manner free events associated
shenanigans taking place London
across UK mark occasion. ten ways
join spending money.
After stemming:
Shell thousand, surviv various ballot,
swap famil member ticket enjoy 2012
Summer Olymp Game year. Manner
free event associat shenanigan tak
place London across UK mark
occasion. ten way join spend money.
![Page 34: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/34.jpg)
SYNTACTIC TEXTUAL ANALYSIS
Vector representation
TF-IDF
<free, 0.0116>
<olymp, 0.0235>
<event, 0.0012>
<way, 0.0125>
<london, 0.0421>
<celebrat, 0.0005>
<chef, 0.0127>
…
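As a rough illustration of how such weights can be obtained, the sketch below computes one common TF-IDF variant: term frequency as count divided by document length, and inverse document frequency as log(N / df). The talk does not state which variant or normalization produced the values above, so this code is not expected to reproduce them.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per tokenized document.

    tf  = term count / document length
    idf = log(N / number of documents containing the term)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document gets weight 0 under this variant, which is why rare, topical stems such as `olymp` end up weighted above ubiquitous ones.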
![Page 36: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/36.jpg)
IS THE SYNTACTIC APPROACH ALONE ENOUGH?
Polysemy…
“BASS”
![Page 37: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/37.jpg)
IS THE SYNTACTIC APPROACH ALONE ENOUGH?
Synonymy…
Vehicle
Car
Automobile
Auto
Machine
![Page 38: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/38.jpg)
SEMANTIC TEXTUAL ANALYSIS
Taxonomy-based Classification
Word Disambiguation
![Page 39: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/39.jpg)
SEMANTIC TEXTUAL ANALYSIS
Taxonomy-based Classification
Classification Features (CF) representation
Adopted classifiers
Rocchio
SVM
![Page 40: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/40.jpg)
SEMANTIC TEXTUAL ANALYSIS
Rocchio
Each centroid is defined as a sum of TF-IDF values of each
term, normalized by the number of webpages in the class
The classification is based on the
cosine of the angle between the
webpage and the centroid of each class
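The description above can be sketched as follows, assuming each webpage is already represented as a sparse TF-IDF vector (a dict from term to weight). Only the centroid-based form described on this slide is shown. The function names and the toy data in the usage note are hypothetical.

```python
import math


def centroid(vectors: list[dict[str, float]]) -> dict[str, float]:
    """Sum the TF-IDF values of each term, divided by the number of pages."""
    total: dict[str, float] = {}
    for vec in vectors:
        for term, weight in vec.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(page: dict[str, float],
             training: dict[str, list[dict[str, float]]]) -> str:
    """Return the class whose centroid has the highest cosine with the page."""
    centroids = {label: centroid(vecs) for label, vecs in training.items()}
    return max(centroids, key=lambda label: cosine(page, centroids[label]))
```

With two toy classes, a page containing the stem `olymp` is assigned to the class whose training pages also contain it, because the cosine with the other centroid is zero.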
![Page 41: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/41.jpg)
SEMANTIC TEXTUAL ANALYSIS
SVM
The score is related to the
distance of the webpage from a
separation hyperplane
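For a linear SVM, the distance mentioned above has a closed form. The sketch assumes the weight vector `w` and bias `b` have already been learned (training is not shown) and that the page is a dense feature vector `x`. These names are illustrative, not taken from the talk.

```python
import math


def decision_value(w: list[float], x: list[float], b: float) -> float:
    """Unnormalized score w . x + b; its sign gives the predicted side."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def signed_distance(w: list[float], x: list[float], b: float) -> float:
    """Signed Euclidean distance of x from the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return decision_value(w, x, b) / norm
```

A larger positive value means the page lies further inside the positive class, which is why it can serve as a classification score.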
![Page 42: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/42.jpg)
SEMANTIC TEXTUAL ANALYSIS
Word Disambiguation
Bag of Concepts (BoC) representation
Adopted lexical supports
WordNet
YAGO
ConceptNet
![Page 43: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/43.jpg)
SEMANTIC TEXTUAL ANALYSIS
WordNet
A large lexical database
of English. Nouns, verbs,
adjectives and adverbs
are grouped into sets of
cognitive synonyms
(synsets), each
expressing a distinct
concept.
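To illustrate how synsets lead to a Bag of Concepts, the sketch below maps words to concept identifiers through a tiny hand-made lexicon. The lexicon and its identifiers are made up for the example and only stand in for a resource such as WordNet. The sketch handles synonymy only and does not attempt to disambiguate polysemous words.

```python
from collections import Counter

# Hypothetical mini-lexicon: surface word -> concept identifier.
# It stands in for a real lexical resource; the identifiers are invented.
WORD_TO_CONCEPT = {
    "car": "concept:motor_vehicle",
    "automobile": "concept:motor_vehicle",
    "auto": "concept:motor_vehicle",
    "chef": "concept:cook",
    "cook": "concept:cook",
}


def bag_of_concepts(tokens: list[str]) -> Counter:
    """Count concepts instead of words; words missing from the lexicon are skipped."""
    return Counter(WORD_TO_CONCEPT[t] for t in tokens if t in WORD_TO_CONCEPT)
```

Here `car` and `auto` contribute to the same concept count, so two texts that use different synonyms can still match.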
![Page 44: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/44.jpg)
SEMANTIC TEXTUAL ANALYSIS
YAGO
A semantic knowledge base, derived from Wikipedia,
WordNet and GeoNames
![Page 45: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/45.jpg)
SEMANTIC TEXTUAL ANALYSIS
ConceptNet
A network of concepts connected by several semantic
relations (e.g., “IsA”, “PartOf”)
![Page 46: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/46.jpg)
MATCHING
Similarity calculation
Ranking
![Page 47: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/47.jpg)
MATCHING
Similarity calculation
Adopted approaches
Cosine similarity
Jaccard index
![Page 49: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/49.jpg)
MATCHING
Jaccard index
The Jaccard coefficient measures similarity between sample sets,
and is defined as the size of the intersection divided by the size of
the union of the sample sets
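The definition above translates directly into code. In this sketch the page and the ad are each represented as a set of terms. Returning 0.0 when both sets are empty is a convention chosen here, not something stated in the talk.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)
```

For example, the sets `{"olymp", "free", "london"}` and `{"free", "london", "chef"}` share two of four distinct terms, giving 0.5.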
![Page 50: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/50.jpg)
MATCHING
Ranking
Adopted approaches
Simple ranking according to the calculated scores
Learning to rank model
![Page 51: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/51.jpg)
MATCHING
Learning to rank model
Pointwise approach
Each query-document pair in the training data has a numerical or ordinal score
Regression problem approach: given a single query-document pair, predict its score
Pairwise approach
Classification problem approach: learn a binary classifier that tells which document is better in a given pair of documents
Listwise approach
Optimization problem approach: directly optimize the value of a chosen evaluation measure, averaged over all queries in the training data
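As an illustration of the pairwise approach, the sketch below turns the scored candidates of one query into binary training examples. Each example is a feature-difference vector with label +1 if the first candidate should rank higher and -1 otherwise. The feature vectors and the choice of a difference representation are assumptions made for the example. Training the classifier itself is not shown.

```python
from itertools import combinations


def pairwise_examples(candidates: list[tuple[list[float], float]]):
    """Build (feature_difference, label) pairs from one query's candidates.

    candidates: list of (feature_vector, relevance_score).
    Pairs with equal relevance carry no preference and are skipped.
    """
    examples = []
    for (xa, ya), (xb, yb) in combinations(candidates, 2):
        if ya == yb:
            continue
        diff = [a - b for a, b in zip(xa, xb)]
        examples.append((diff, 1 if ya > yb else -1))
    return examples
```

Any binary classifier trained on these pairs can then be used to order unseen candidates.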
![Page 52: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/52.jpg)
AN EXAMPLE: CONCA
CONCEPTS IN CONTEXTUAL ADVERTISING
![Page 55: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/55.jpg)
RESULTS
SYNTAX VS SEMANTICS
![Page 56: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/56.jpg)
SYNTACTICAL ANALYSIS
Text summarization techniques comparison
FLP vs TFLP vs S vs TS
Comparison metrics
$$\pi = \frac{TP}{TP + FP} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$$
$$\rho = \frac{TP}{TP + FN} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|}$$
$$F_1 = \frac{2 \pi \rho}{\pi + \rho}$$
Taxonomy
BankSearch Dataset
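The three comparison metrics (precision π, recall ρ and F1) can be computed from the sets of relevant and retrieved documents as sketched below. Returning 0.0 for empty denominators is a convention assumed here.

```python
def precision(relevant: set, retrieved: set) -> float:
    """TP / (TP + FP): share of retrieved documents that are relevant."""
    return len(relevant & retrieved) / len(retrieved) if retrieved else 0.0


def recall(relevant: set, retrieved: set) -> float:
    """TP / (TP + FN): share of relevant documents that were retrieved."""
    return len(relevant & retrieved) / len(relevant) if relevant else 0.0


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0
```

With four relevant documents and three retrieved, two of which are relevant, precision is 2/3, recall is 1/2 and F1 is 4/7.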
![Page 57: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/57.jpg)
SYNTACTICAL ANALYSIS
Results
Adding information about the title improves performance
TFLP has the best performance

| | FLP | TFLP | S | TS |
|---|---|---|---|---|
| π | 0.745 | 0.832 | 0.734 | 0.806 |
| ρ | 0.719 | 0.801 | 0.730 | 0.804 |
| F1 | 0.732 | 0.816 | 0.732 | 0.805 |
| #t | 24 | 26 | 12 | 14 |
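As a quick consistency check, the F1 values reported above can be recomputed from the reported π and ρ values, assuming F1 is the usual harmonic mean of the two. The dictionary simply copies the figures from the results.

```python
# (precision, recall, reported F1) per summarization technique, copied from the results.
reported = {
    "FLP": (0.745, 0.719, 0.732),
    "TFLP": (0.832, 0.801, 0.816),
    "S": (0.734, 0.730, 0.732),
    "TS": (0.806, 0.804, 0.805),
}


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Recompute F1 to three decimals for comparison with the reported column.
recomputed = {name: round(f1_score(p, r), 3) for name, (p, r, _) in reported.items()}
```

All four recomputed values agree with the reported ones to three decimals.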
![Page 58: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/58.jpg)
SEMANTIC ANALYSIS
Semantic approaches comparison
Anagnostopoulos et al. (2007) system vs Armano et al.
(2011-TIR) vs ConCA
Matching function
$$\mathrm{score}(p, a) = \alpha \cdot \mathrm{sim}_{BoC}(p, a) + (1 - \alpha) \cdot \mathrm{sim}_{CF}(p, a)$$
Comparison metric
$$\pi@k = \frac{\sum_{i=1}^{N} \sum_{j=1}^{k} TP_{ij}}{\sum_{i=1}^{N} \sum_{j=1}^{k} (TP_{ij} + FP_{ij})}$$
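A minimal sketch of the matching function and of precision at k, assuming p denotes a page, a an ad, and that relevance judgements are available as one ranked list of booleans per page. The function and parameter names are illustrative.

```python
def combined_score(sim_boc: float, sim_cf: float, alpha: float) -> float:
    """Weighted sum: alpha weights the BoC similarity, (1 - alpha) the CF similarity."""
    return alpha * sim_boc + (1 - alpha) * sim_cf


def precision_at_k(judgements: list[list[bool]], k: int) -> float:
    """Precision over the top-k ads of every page.

    judgements[i][j] is True when the j-th ranked ad for page i is relevant.
    """
    true_pos = sum(sum(1 for rel in page[:k] if rel) for page in judgements)
    suggested = sum(len(page[:k]) for page in judgements)
    return true_pos / suggested if suggested else 0.0
```

With `alpha = 0.1`, a BoC similarity of 0.8 and a CF similarity of 0.4 combine to 0.44, so the CF term dominates the ranking.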
![Page 59: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/59.jpg)
SEMANTIC ANALYSIS
Ad repository
Built by hand by a domain expert
Taxonomy
BankSearch Dataset
![Page 60: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/60.jpg)
SEMANTIC ANALYSIS
Results
| k | Anagnostopoulos et al. π | Anagnostopoulos et al. α | Armano et al. π | Armano et al. α | ConCA π | ConCA α |
|---|---|---|---|---|---|---|
| 1 | 0.674 | 0 | 0.768 | 0.2 | 0.773 | 0.1 |
| 2 | 0.653 | 0 | 0.750 | 0.2 | 0.752 | 0.1 |
| 3 | 0.617 | 0.2 | 0.729 | 0.3 | 0.728 | 0.1 |
| 4 | 0.582 | 0.2 | 0.701 | 0.3 | 0.701 | 0.1 |
| 5 | 0.546 | 0.1 | 0.663 | 0.0 | 0.668 | 0.1 |
![Page 62: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/62.jpg)
SEMANTIC ANALYSIS
Results
Slight improvement by using concepts
Low values of α → CF has more impact than BoC
![Page 63: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/63.jpg)
SYNTACTICAL ANALYSIS VS SEMANTIC ANALYSIS
Contextual Advertising System
Armano et al. (2011-TIR)
Matching function
$$\mathrm{score}(p, a) = \alpha \cdot \mathrm{sim}_{BoW}(p, a) + (1 - \alpha) \cdot \mathrm{sim}_{CF}(p, a)$$
Comparisons varying α
α = 1 → pure syntax
α = 0 → pure semantics
Comparison metric
$$\pi@k = \frac{\sum_{i=1}^{N} \sum_{j=1}^{k} TP_{ij}}{\sum_{i=1}^{N} \sum_{j=1}^{k} (TP_{ij} + FP_{ij})}$$
![Page 64: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/64.jpg)
SYNTACTICAL ANALYSIS VS SEMANTIC ANALYSIS
Ad repository
Built by hand by a domain expert
Taxonomy
BankSearch Dataset
![Page 65: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/65.jpg)
SYNTACTICAL ANALYSIS VS SEMANTIC ANALYSIS
Results
| α | π@1 | π@2 | π@3 | π@4 | π@5 |
|---|---|---|---|---|---|
| 0 | 0.765 | 0.746 | 0.719 | 0.696 | 0.663 |
| 0.1 | 0.767 | 0.749 | 0.724 | 0.698 | 0.663 |
| 0.2 | 0.768 | 0.750 | 0.729 | 0.699 | 0.662 |
| 0.3 | 0.766 | 0.749 | 0.729 | 0.701 | 0.661 |
| 0.4 | 0.756 | 0.747 | 0.729 | 0.698 | 0.658 |
| 0.5 | 0.744 | 0.735 | 0.721 | 0.693 | 0.651 |
| 0.6 | 0.722 | 0.717 | 0.703 | 0.681 | 0.640 |
| 0.7 | 0.685 | 0.687 | 0.680 | 0.658 | 0.625 |
| 0.8 | 0.632 | 0.637 | 0.635 | 0.614 | 0.586 |
| 0.9 | 0.557 | 0.552 | 0.548 | 0.534 | 0.512 |
| 1 | 0.408 | 0.421 | 0.372 | 0.388 | 0.640 |
![Page 66: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/66.jpg)
CONCLUSIONS
![Page 67: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/67.jpg)
CONCLUSIONS
Online advertising
represents one of the major sources of income for a large
number of websites
is aimed at suggesting products and services to the
population of Internet users
Modern contextual advertising systems
put ads within the content of a generic, third party,
webpage
adopt both syntactical and semantic textual analyses to
select the most relevant ads for a given webpage
an example is ConCA
![Page 68: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/68.jpg)
CONCLUSIONS
Results show that
the impact of semantics is stronger than that of syntax
adopting more advanced semantic techniques, such as
concepts, improves performance
the more ads are suggested, the worse the
performance is
![Page 69: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/69.jpg)
REFERENCES
![Page 70: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/70.jpg)
REFERENCES
Syntactical Textual Analysis
Armano G., Giuliani A. & Vargiu E. Experimenting text summarization techniques for contextual advertising. 2nd Italian Information Retrieval Workshop (IIR’11), 2011.
Armano G., Giuliani A. & Vargiu E. Using snippets in text summarization: a comparative study and an application. 3rd Italian Information Retrieval Workshop (IIR’12), 2012.
Kolcz A., Prabakarmurthi V. & Kalita J. Summarization as feature selection for text categorization. 10th International Conference on Information and Knowledge Management (CIKM’01). ACM, New York, NY, USA, pp. 365–370, 2001.
Porter M. An algorithm for suffix stripping. Program, 14, 3, pp. 130–137, 1980.
Salton G., Wong A. & Yang C.S. A vector space model for automatic indexing. Communications of the ACM, 18, 11, pp. 613–620, 1975.
![Page 71: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/71.jpg)
REFERENCES
Semantic Textual Analysis
Cortes C. & Vapnik V.N. Support-Vector Networks. Machine Learning, 20, 1995.
Fellbaum C. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, 1998.
Liu H. & Singh P. ConceptNet: A practical commonsense reasoning tool-kit. BT Technology Journal, 22, pp. 211–226, 2004.
Miller G.A. WordNet: A Lexical Database for English. Communications of the ACM, 38, 11, pp. 39–41, 1995.
Rocchio J. The SMART Retrieval System: Experiments in Automatic Document Processing. Prentice Hall, Chapter: Relevance feedback in information retrieval, pp. 313–323, 1971.
Suchanek F.M., Kasneci G. & Weikum G. Yago - A Core of Semantic Knowledge. 16th International World Wide Web conference (WWW 2007), 2007.
![Page 72: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/72.jpg)
REFERENCES
Matching
Liu T.Y. Learning to rank for information retrieval. Found. Trends Inf. Retr., 3, 3, pp. 225–331, 2009.
Radomski P.J. & Goeman T.J. The homogenizing of Minnesota lake fish assemblages. Fisheries, 20, pp. 20–23, 1995.
Comparison Systems
Anagnostopoulos A., Broder A.Z., Gabrilovich E., Josifovski V. & Riedel L. Just-in-time contextual advertising. 16th ACM Conference on Information and Knowledge Management (CIKM’07). ACM, New York, NY, USA, pp. 331–340, 2007.
Armano G., Giuliani A. & Vargiu E. Studying the impact of text summarization on contextual advertising. 8th International Workshop on Text-based Information Retrieval (TIR’11), 2011.
Armano G., Giuliani A. & Vargiu E. Semantic enrichment of contextual advertising by using concepts. International Conference on Knowledge Discovery and Information Retrieval, 2011.
![Page 73: Seminario Eloisa Vargiu, 06-09-2012](https://reader033.fdocuments.in/reader033/viewer/2022051211/55599c38d8b42aa4288b4571/html5/thumbnails/73.jpg)
Contact: Eloisa Vargiu – evargiu@bdigital.org