Liangliang Cao - University Of Illinoiscao4/papers/geo_photomemex_talk.pdf · Recommendation...

Geophoto Memex Liangliang Cao

Page 1:

Geophoto Memex

Liangliang Cao

Page 2:

What is “Geophoto Memex”?

• Geophoto Memex: Record all the photos that are associated with locations in the world, and provide geographical analytics on request.

• Our related papers:

– ACMMM 2009: “Enhancing semantic and geographic annotation of web images via logistic canonical correlation regression”

– ICASSP 2010: “A worldwide tourism recommendation system based on geotagged web photos”

– SDM 2011: “Diversified Trajectory Pattern Ranking in Geo-tagged Social Media”

– WWW 2011: “Geographical topic discovery and comparison”

Page 3:

Geophotos

• Where is the data from?

− Advanced cameras with GPS receivers

− GPS sensors in smartphones

− Web apps including Google Earth, Flickr, Twitter

Page 4:

Project Overview

Stages: Data Collection, Data Cleaning, Data Analytics

Data Collection:

• Already collected 1M geo-tagged photos

• Aim to collect

− 100M+ geo-tagged photos from Flickr

− More geo-tagged documents from Twitter

Page 5:

Project Overview

Stages: Data Collection, Data Cleaning, Data Analytics

Data Cleaning (Project 1):

• Remove the label ambiguity

• Refine the annotation

Page 6:

Project Overview

Stages: Data Collection, Data Cleaning, Data Analytics

Data Analytics (Project 2, Project 3):

• Tourism Recommendation

• Geo-info discovery

• User interest mining

Page 7:

Our Projects

Project 1: Geographical & Semantic Annotation

Project 2: Tourism Recommendation

Project 3: Geographical Topics in Social Media

Page 8:

Geographical & Semantic Annotation

Image tags: 2006, clouds, sc, d50, mywinners, nikon, pond, reflections, sun, september

• User-provided annotations are usually limited and noisy.

– Ambiguous or irrelevant labels

– Only a small fraction of photos are geo-tagged

• Is it possible to refine and enrich these annotations?

– Large-scale visual recognition can help.

– We combine both visual features and tag features in the classifiers.

Page 9:

Geographical & Semantic Annotation

Image tags: 2006, clouds, sc, d50, mywinners, nikon, pond, reflections, sun, september

Annotation questions:

– What exists in the image?

– Where was the image taken?

Page 10:

Geographical & Semantic Annotation

Image tags: 2006, clouds, sc, d50, mywinners, nikon, pond, reflections, sun, september

• Annotation cues lie in different features.

• We train a model using Flickr data to annotate the images automatically.

Page 11:

Geographical & Semantic Annotation

Image tags: 2006, clouds, sc, d50, mywinners, nikon, pond, reflections, sun, september

New annotation: nature, sky, water

Page 12:

Combining Visual Features and Noisy Labels

• There are multiple features for online images

– Visual features: color, shape, GIST…

– Noisy annotations

• We explore the canonical correlations between multiple features and use them to enrich the annotations.

Page 13:

Canonical Correlation Analysis

Let x and y represent two feature vectors. CCA looks for projection vectors a and b such that the optimal a, b maximize the correlation between the two features in the projected subspaces.

It is easy to show that the solution can be found by solving a generalized eigendecomposition problem.
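For reference, the textbook formulation of CCA can be written as follows. The symbols $C_{xx}$, $C_{yy}$, $C_{xy}$ (covariance and cross-covariance matrices of x and y) and $\rho$ are assumed notation, not taken from the slide.

```latex
% Projections of the two feature vectors
u = a^{\top} x, \qquad v = b^{\top} y

% Correlation maximized over the projection vectors a and b
\rho = \max_{a,\,b} \;
  \frac{a^{\top} C_{xy}\, b}
       {\sqrt{a^{\top} C_{xx}\, a}\;\sqrt{b^{\top} C_{yy}\, b}}

% Equivalent generalized eigenvalue problem (with C_{yx} = C_{xy}^{\top})
C_{xy}\, C_{yy}^{-1}\, C_{yx}\, a = \rho^{2}\, C_{xx}\, a,
\qquad b \propto C_{yy}^{-1}\, C_{yx}\, a
```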

Page 14:

CCA for a Toy Example

Neither of the two dimensions in the original space characterizes the linear correlation. However, after projecting the data into the canonical space, we can see the linear correlation clearly.
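A toy example of this kind can be sketched with NumPy. This is a minimal illustration, not the data behind the slide: the function name `first_canonical_pair`, the ridge term `reg`, and the synthetic data (a shared signal `s` buried under strong independent noise) are all assumptions made here.

```python
import numpy as np

def first_canonical_pair(X, Y, reg=1e-8):
    """Return (a, b, rho) for the leading canonical pair.

    X is (n, p) and Y is (n, q); rows are paired samples.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariance blocks; `reg` is a small ridge for numerical stability.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Eigenproblem: Cxx^{-1} Cxy Cyy^{-1} Cyx a = rho^2 a
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
    eigvals, eigvecs = np.linalg.eig(M)
    a = eigvecs[:, np.argmax(eigvals.real)].real
    # The matching projection for Y follows from a.
    b = np.linalg.solve(Cyy, Cxy.T @ a)
    rho = np.corrcoef(Xc @ a, Yc @ b)[0, 1]
    return a, b, rho

# Hypothetical toy data: both views contain the same signal s,
# but each raw dimension is dominated by its own noise.
rng = np.random.default_rng(0)
s = rng.normal(size=2000)
n1 = rng.normal(size=2000)
n2 = rng.normal(size=2000)
X = np.column_stack([s + 3 * n1, n1])
Y = np.column_stack([s + 3 * n2, n2])

a, b, rho = first_canonical_pair(X, Y)
raw = np.corrcoef(X[:, 0], Y[:, 0])[0, 1]
print("correlation of raw first dimensions:", raw)
print("correlation in canonical space:", rho)
```

In this construction the raw dimensions are only weakly correlated, while the canonical projections recover the shared signal almost exactly.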

Page 15:

Logistic Canonical Correlation Regression

• Given multiple features, we can compute the canonical correlation between each feature and a given label.

• To combine the clues from multiple features, we employ the logistic canonical correlation regression (LCCR) model, which is fit by maximizing the likelihood.

• The estimated function takes as input the correlation between the label and the m-th feature, for each feature m, together with the parameters of the logistic model.
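The idea of combining per-feature correlation scores in a logistic model can be sketched as below. This is an illustration under stated assumptions, not the LCCR formulation from the paper: it assumes the probability has the form sigmoid(bias + sum of weight_m * corr_m), uses plain batch gradient ascent on the log-likelihood, and all names and the toy scores are hypothetical.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def predict(weights, bias, corr):
    """Probability that the label applies, given one correlation score per feature."""
    return sigmoid(bias + sum(w * c for w, c in zip(weights, corr)))

def fit_logistic(samples, labels, lr=0.5, epochs=2000):
    """Maximize the Bernoulli log-likelihood by batch gradient ascent."""
    m = len(samples[0])
    n = len(samples)
    weights, bias = [0.0] * m, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * m, 0.0
        for corr, y in zip(samples, labels):
            err = y - predict(weights, bias, corr)  # gradient of the log-likelihood
            grad_b += err
            for j in range(m):
                grad_w[j] += err * corr[j]
        bias += lr * grad_b / n
        weights = [w + lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias

# Hypothetical scores: [visual-feature correlation, tag-feature correlation]
samples = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9],
           [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic(samples, labels)
p_high = predict(weights, bias, [0.9, 0.9])
p_low = predict(weights, bias, [0.1, 0.1])
print(p_high, p_low)
```

An image whose features all correlate strongly with the label receives a high probability, and one with weak correlations a low probability.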

Page 16:

Dataset

• We collect 380,573 images with tags and GPS records from Flickr.

• The number of tags for each image

– varies from zero to over ten

– the average number is 4.96 tags per image.

• The GPS location:

– The geographic scope is limited to North America (users in other areas may use different languages for tags)

Page 17:

Enhancing Semantic Annotation

• We employ 66 semantic concepts (tags) for semantic annotation: the most popular labels on Flickr

• We train our LCCR model based on multiple features

– 6 visual features: LAB color histogram, GIST, tiny image, LAB color of tiny image, image projection in PCA and LDA spaces.

– Existing tag features: we remove the terms that are the same as labeling concepts in the process of both training and testing because they are the very annotation we are trying to predict.
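The tag-filtering step described above can be sketched as follows. The function name `tag_features` and the three-concept list are hypothetical; the slides use 66 concepts that are not enumerated here.

```python
def tag_features(tags, concepts):
    """Drop tags that coincide with a labeling concept (case-insensitive),
    so the classifier never sees the answer it is asked to predict."""
    concept_set = {c.lower() for c in concepts}
    return [t for t in tags if t.lower() not in concept_set]

# Hypothetical subset of the labeling concepts
concepts = ["nature", "sky", "water"]
tags = ["canon", "water", "ocean", "wildlife", "fish", "20d", "seagull"]

filtered = tag_features(tags, concepts)
print(filtered)
```

The same filter would be applied in both training and testing, so the remaining tags act only as context for the prediction.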

Page 18:

More Examples

canon, water, ocean, wildlife, fish, 20d, seagull, feathery friday, camping, gull

river, colors, sony, quebec, minolta, paysage, a100, automne

nikon, red, green, usa, flower, purple, october, plants, texture, flora, illinois, natural, pattern, wallpaper

Page 19:

More Examples

canon, water, ocean, wildlife, fish, 20d, seagull, feathery friday, camping, gull
New annotation: bird, nature, sea

river, colors, sony, quebec, minolta, paysage, a100, automne
New annotation: autumn, fall, landscape

nikon, red, green, usa, flower, purple, october, plants, texture, flora, illinois, natural, pattern, wallpaper
New annotation: autumn, fall, flower, garden, nature

Page 20:

More Examples

canon, water, ocean, wildlife, fish, 20d, seagull, feathery friday, camping, gull
New annotation: bird, nature, sea

river, colors, sony, quebec, minolta, paysage, a100, automne
New annotation: autumn, fall, landscape

nikon, red, green, usa, flower, purple, october, plants, texture, flora, illinois, natural, pattern, wallpaper
New annotation: autumn, fall, flower, garden, nature

Page 21:

Evaluation: Geographical Annotation

Page 22:

Evaluation: Semantic Annotation

Page 23:

Our Projects

Project 1: Geographical & Semantic Annotation

Project 2: Tourism Recommendation

Project 3: Geographical Topics in Social Media

Page 24:

Tourism Recommendation from Image Retrieval

Query → Similar photos from indexed dataset → Recommended places
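The retrieval-then-recommend flow can be sketched as below. It is a minimal illustration, assuming each indexed photo is a small feature vector tagged with a place name, cosine similarity as the matching score, and a majority vote over the top-k matches. None of these choices is stated on the slide, and the index contents are made up.

```python
import math
from collections import Counter

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend_places(query_vec, index, k=3):
    """index: list of (feature_vector, place_name) pairs.

    Retrieve the k most similar indexed photos, then rank places by votes.
    """
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    votes = Counter(place for _, place in ranked[:k])
    return [place for place, _ in votes.most_common()]

# Hypothetical indexed dataset of geo-tagged photos
index = [
    ([0.9, 0.1, 0.0], "Eiffel Tower"),
    ([0.8, 0.2, 0.1], "Eiffel Tower"),
    ([0.1, 0.9, 0.1], "Big Ben"),
    ([0.0, 0.8, 0.3], "Big Ben"),
    ([0.1, 0.1, 0.9], "London Eye"),
]

recommended = recommend_places([0.85, 0.15, 0.05], index)
print(recommended)
```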

Page 25:

Find Popular Attractions

Popular attractions are usually those with many photos.

(Different colors denote different attractions.)
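One simple way to find photo-dense spots is to bin coordinates into a grid and count photos per cell. The slide does not say which clustering method was used, so the grid below is a stand-in, and the cell size, threshold, and coordinates are hypothetical.

```python
import math
from collections import Counter

def popular_cells(points, cell_deg=0.01, min_photos=3):
    """Bin (lat, lon) points into square grid cells of `cell_deg` degrees
    and return the cells holding at least `min_photos` photos, densest first."""
    counts = Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for lat, lon in points
    )
    return [(cell, n) for cell, n in counts.most_common() if n >= min_photos]

# Hypothetical geo-tags: four photos near one spot, two near another, one isolated
photos = [
    (48.8584, 2.2945), (48.8586, 2.2943), (48.8581, 2.2948), (48.8583, 2.2941),
    (48.8606, 2.3376), (48.8609, 2.3372),
    (48.8530, 2.3499),
]

hot_spots = popular_cells(photos)
print(hot_spots)
```

A density-based clustering method would avoid the arbitrary cell boundaries of a grid; the grid is used here only because it fits in a few lines.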

Page 26:

Mining Top Tourist Routes in Big Cities

London Eye → Big Ben → Downing Street → Horse Guards → Trafalgar Square

Apple Store → St. Patrick's Cathedral → Rockefeller Center

Eiffel Tower → Louvre → Notre-Dame
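Routes like these can be approximated by counting how many users visit the same attractions in the same order. The sketch below uses plain frequency counting of contiguous sub-sequences, which is a simplification and not the diversified ranking of the SDM 2011 paper. The function name and the trajectories are hypothetical.

```python
from collections import Counter

def top_routes(trajectories, length=3, top_k=1):
    """Count contiguous sub-sequences of `length` attractions.

    Each route is counted at most once per trajectory, so one user
    looping around the same sights does not dominate the ranking.
    """
    counts = Counter()
    for visits in trajectories:
        seen = {tuple(visits[i:i + length])
                for i in range(len(visits) - length + 1)}
        counts.update(seen)
    return counts.most_common(top_k)

# Hypothetical per-user visit sequences, ordered by photo timestamp
trajectories = [
    ["London Eye", "Big Ben", "Downing Street", "Horse Guards", "Trafalgar Square"],
    ["London Eye", "Big Ben", "Downing Street"],
    ["Big Ben", "Downing Street", "Horse Guards"],
    ["London Eye", "Big Ben", "Downing Street", "Trafalgar Square"],
    ["Trafalgar Square", "Horse Guards"],
]

best = top_routes(trajectories)
print(best)
```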

Page 27:

Our Projects

Project 1: Geographical & Semantic Annotation

Project 2: Tourism Recommendation

Project 3: Geographical Topics in Social Media

Page 28:

Topics over Geographical Regions [WWW ’11]

Input:

Output:

1. Geographic topics

2. Topics at a location

Page 29:

Motivations

• Goal:

– Analyze the cultural differences around the world

– Explore the hot topics or events in different places

– Compare the popularity of specific products in different regions

• Latent Geographical Topic Analysis

– The topics are generated from regions instead of documents

– If two words are close to each other in space, they are more likely to belong to the same region

– If two words are from the same region, they are more likely to be clustered into the same topic

Page 30:

Latent Geographical Topic Analysis

Model components (from the diagram):

• Region importance (N-d vector)

• Region geo-information: location, shape

• Region-topic distributions {p(z|r)}

• Topic-word distributions {p(w|z)}
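How these components could yield the topics at a location can be sketched as below. This is an illustration under explicit assumptions, not the model's inference procedure: each region is taken to be an axis-aligned 2-D Gaussian (mean for "location", variance for "shape"), and the topic mixture at a location l is computed as p(z|l) = sum over r of p(z|r) p(r|l), with p(r|l) proportional to importance times the Gaussian density. All names and numbers are hypothetical.

```python
import math

def gaussian_pdf(loc, mean, var):
    """Axis-aligned 2-D Gaussian density (diagonal covariance is an assumption)."""
    dens = 1.0
    for x, m, v in zip(loc, mean, var):
        dens *= math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
    return dens

def topics_at_location(loc, regions):
    """Mix the per-region topic distributions by each region's posterior at `loc`."""
    weights = [r["importance"] * gaussian_pdf(loc, r["mean"], r["var"])
               for r in regions]
    total = sum(weights)
    n_topics = len(regions[0]["p_z_given_r"])
    return [
        sum(w / total * r["p_z_given_r"][z] for w, r in zip(weights, regions))
        for z in range(n_topics)
    ]

# Two hypothetical regions with different dominant topics
regions = [
    {"importance": 0.5, "mean": (0.0, 0.0), "var": (1.0, 1.0),
     "p_z_given_r": [0.9, 0.1]},
    {"importance": 0.5, "mean": (10.0, 10.0), "var": (1.0, 1.0),
     "p_z_given_r": [0.2, 0.8]},
]

near_first = topics_at_location((0.5, 0.2), regions)
near_second = topics_at_location((9.5, 10.1), regions)
print(near_first, near_second)
```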

Page 31:

Location/Text Perplexity

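\[ \mathrm{perplexity}_{\text{location/text}}(D_{\text{test}}) = \exp\left\{ -\frac{\sum_{d \in D_{\text{test}}} \log p(w_d, l_d)}{\sum_{d \in D_{\text{test}}} N_d} \right\} \]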
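Given per-document joint log-probabilities and word counts, the perplexity is a one-line computation. The sketch below assumes the usual negative sign in the exponent and uses made-up values; how log p(w_d, l_d) is obtained from a trained model is outside its scope.

```python
import math

def perplexity(log_probs, doc_lengths):
    """exp of the negative total log-likelihood divided by the total word count.

    log_probs[d]   : log p(w_d, l_d) for test document d
    doc_lengths[d] : N_d, the number of words in document d
    """
    return math.exp(-sum(log_probs) / sum(doc_lengths))

# Hypothetical test set: every word contributes probability 0.1,
# so documents of 3 and 5 words have log-probabilities 3*log(0.1) and 5*log(0.1).
log_probs = [3 * math.log(0.1), 5 * math.log(0.1)]
doc_lengths = [3, 5]

value = perplexity(log_probs, doc_lengths)
print(value)
```

Lower values mean the model assigns higher probability to the held-out words and locations.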

Page 32:

Geographical Topic Comparison

• Food dataset

Page 33:

Topics over Geographical Regions

Topics shown: Italian food, Japanese food, Chinese food, Spanish food, Mexican food, French food

The redness represents the probability of each topic at a location.

Page 34:

Distinguish Different Landscapes

Beach, Desert, Mountains

Page 35:

Acknowledgement To My Terrific Collaborators

Thomas Huang, Jiawei Han, Zhijun Yin, Jiebo Luo, Andrew Gallagher