Semantics in Digital Photos: A Contextual Analysis
![Page 1: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/1.jpg)
Semantics in Digital Photos: A Contextual Analysis
Authors / Pinaki Sinha, Ramesh Jain
Conference / The IEEE International Conference on Semantic Computing, 2008, pp. 58-65
Presenter / Meng-Lun, Wu
![Page 2: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/2.jpg)
Outline
Introduction
Related Work
The Optical Context Layer
Photo Clustering
Photo Classification
Annotation in Digital Photos
Results
Conclusion
![Page 3: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/3.jpg)
Introduction
Most research is concerned with extracting semantics using content information only.
All search engines rely on the text associated with the images to search for images.
The authors fuse the content of photos with two types of context using a probabilistic model.
![Page 4: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/4.jpg)
Introduction (cont.)
![Page 5: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/5.jpg)
Introduction (cont.)
This paper classifies photos into mutually exclusive classes and automatically tags new photos.
The authors collected the photo dataset from Flickr, which publishes popular tags.
![Page 6: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/6.jpg)
Related Work
Most research uses content-based pixel features, either global or local.
Image search using an example input image, or a query using low-level features, can be difficult and not intuitive for most people.
Correlations among image features and human tags or labels have been studied.
The semantic gap in image retrieval can’t be overcome using pixel features alone.
![Page 7: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/7.jpg)
Related Work (cont.)
Recent research has used the optical context layer to classify photos.
Boutell and Luo [3] use pixel values and optical metadata for classification.
Model [6] by fusing ontology.
[3] M. Boutell and J. Luo. Bayesian fusion of camera metadata cues in semantic scene classification. In Proc. IEEE CVPR, 2004.
[6] P. Duygulu, K. Barnard, N. de Freitas, and D. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In Proc. ECCV, 2002.
![Page 8: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/8.jpg)
The Optical Context Layer
The Exchangeable Image File Format (EXIF) standard specifies the camera parameters recorded with each photo.
Fundamental parameters: Exposure Time, Focal Length, F-number, Flash, Metering Mode and ISO.
![Page 9: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/9.jpg)
Photo Clustering
The LogLight metric has a small value when the ambient light is high.
Similarly, it has a large value when the ambient light is low.
$$LogLightMetric = K + \lg\left(\frac{ET \times AA \times ISO}{FL^{2}}\right)$$
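A minimal sketch of how such a metric could be computed from EXIF values. The exact form of the slide's formula did not survive extraction, so this assumes the form K + log2(ET x AA x ISO / FL^2), where ET is the exposure time, AA the aperture area, FL the focal length and K a constant; the base-2 logarithm, the default K = 0 and the function names are assumptions for illustration, not the authors' code.

```python
import math


def aperture_area(focal_length_mm, f_number):
    """Area of the aperture opening; its diameter is focal length / F-number."""
    diameter = focal_length_mm / f_number
    return math.pi * (diameter / 2) ** 2


def log_light_metric(exposure_time_s, f_number, iso, focal_length_mm, k=0.0):
    """Assumed form: K + log2(ET * AA * ISO / FL^2).

    Bright scenes need a short exposure, a small aperture and a low ISO,
    so the value is small when the ambient light is high.
    """
    aa = aperture_area(focal_length_mm, f_number)
    return k + math.log2(exposure_time_s * aa * iso / focal_length_mm ** 2)


# Hypothetical settings: a sunny outdoor shot versus a night shot.
sunny = log_light_metric(1 / 1000, 8.0, 100, 50.0)
night = log_light_metric(1 / 4, 2.8, 1600, 50.0)
print(sunny < night)
```

Under this assumed form the focal length cancels out, because the aperture area divided by FL^2 depends only on the F-number.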
![Page 10: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/10.jpg)
Photo Clustering (cont.)
The LogLight distributions of photos shot with flash and without flash are modeled as mixtures of Gaussians.
Bayesian model selection is used to find the optimal model, and the Expectation Maximization (EM) algorithm to fit the model parameters.
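The fitting procedure can be sketched as follows for one-dimensional LogLight values. This is an illustrative implementation under stated assumptions, not the authors' code: the Bayesian Information Criterion (BIC) stands in for Bayesian model selection, the initial means are taken from data quantiles, and the function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def mixture_log_likelihood(data, weights, means, variances):
    total = 0.0
    for x in data:
        p = sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))
    return total


def fit_gmm_1d(data, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture with EM."""
    n = len(data)
    ordered = sorted(data)
    # Assumption: start the means at evenly spaced quantiles of the data.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    mean_all = sum(data) / n
    var_all = max(sum((x - mean_all) ** 2 for x in data) / n, 1e-6)
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = max(sum(p), 1e-300)
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj, 1e-6
            )
    return weights, means, variances


def bic(data, k, params):
    """BIC with 3k - 1 free parameters (k means, k variances, k - 1 weights)."""
    return (3 * k - 1) * math.log(len(data)) - 2 * mixture_log_likelihood(data, *params)


def select_components(data, max_k=4):
    """Return (k, params) for the mixture with the lowest BIC."""
    scored = []
    for k in range(1, max_k + 1):
        params = fit_gmm_1d(data, k)
        scored.append((bic(data, k, params), k, params))
    best = min(scored, key=lambda item: item[0])
    return best[1], best[2]
```

Calling `select_components(values)` on a list of LogLight values returns the chosen number of components together with the fitted weights, means and variances.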
![Page 11: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/11.jpg)
Photo Clustering (cont.)
According to the above method, we generated 8 clusters.
We choose 3500 tagged photos.
We find the probability of each photo under each cluster.
We assign the photo to the cluster having maximum probability.
We assign all tags of the photo to that particular cluster.
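The assignment steps above can be sketched as below. This is a simplified illustration that assumes each cluster is a one-dimensional Gaussian over the LogLight value, given as a hypothetical (weight, mean, variance) triple; the slides do not specify which optical features the clusters are defined over.

```python
import math
from collections import Counter, defaultdict


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def assign_cluster(log_light, clusters):
    """Return the index of the cluster with maximum probability for this photo.

    clusters: list of (weight, mean, variance) triples (assumed representation).
    """
    scores = [w * gaussian_pdf(log_light, m, v) for w, m, v in clusters]
    return max(range(len(clusters)), key=scores.__getitem__)


def collect_cluster_tags(photos, clusters):
    """Assign every photo to its cluster and add all of its tags to that cluster.

    photos: iterable of (log_light, tags) pairs.
    """
    tags_by_cluster = defaultdict(Counter)
    for log_light, tags in photos:
        tags_by_cluster[assign_cluster(log_light, clusters)].update(tags)
    return tags_by_cluster
```

The most frequent tags of a cluster are then available through `tags_by_cluster[j].most_common()`.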
![Page 12: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/12.jpg)
Photo Clustering (cont.)
Cluster with High Exposure Time Shots
Cluster with No Flash
![Page 13: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/13.jpg)
Photo Clustering (cont.)
Cluster with Indoor Shots
![Page 14: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/14.jpg)
Photo Classification
The intent of the photographer is somehow hidden in the optical data.
The classes are outdoor day, outdoor night and indoor.
The classes should correspond to different lighting conditions in the LogLight metric.
![Page 15: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/15.jpg)
Photo Classification (cont.)
The classification problem is addressed using optical context only, and also using optical context together with thumbnail pixel features.
The classification algorithm is a decision tree.
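To make the decision-tree step concrete, here is a small sketch of a tree learner that splits on the Gini impurity. The slides do not say which tree algorithm or features were used, so the splitting criterion, the two features (a LogLight value and a flash flag) and the toy rows below are all assumptions made up for illustration.

```python
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature index, threshold) that lowers the weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Return a leaf label (str) or a node tuple (feature, threshold, left, right)."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )


def predict(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Made-up toy rows: (LogLight value, flash fired as 0 or 1).
ROWS = [(2.0, 0), (3.0, 0), (2.5, 0), (8.0, 1), (9.0, 1), (8.5, 0), (14.0, 0), (15.0, 0), (13.5, 1)]
LABELS = ["outdoor day"] * 3 + ["indoor"] * 3 + ["outdoor night"] * 3
TREE = build_tree(ROWS, LABELS)
```

With these toy rows, `predict(TREE, (2.2, 0))` yields the outdoor day class, since the tree only needs two thresholds on the LogLight value to separate the three classes.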
![Page 16: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/16.jpg)
Photo Classification (cont.)
![Page 17: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/17.jpg)
Annotation in Digital Photos
The goal of automatic annotation is to predict words for tagging untagged photos.
The relevance model approach has become quite popular for automatic annotation and retrieval of images.
Automatic annotation is modeled as a language translation problem.
The baseline is the continuous relevance model (CRM).
![Page 18: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/18.jpg)
Annotation in Digital Photos (cont.)
We divided the whole image into rectangular blocks.
For each block, we compute color, texture and shape features.
Each feature vector has 42 dimensions.
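A short sketch of the block division, assuming a regular grid; the grid size and the function name are hypothetical, since the slides do not state how many blocks are used. Each box would then be passed to a feature extractor that returns the 42-dimensional vector.

```python
def image_blocks(width, height, rows, cols):
    """Return (left, top, right, bottom) boxes that tile the image in a rows x cols grid."""
    boxes = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            boxes.append((left, top, right, bottom))
    return boxes


# Example: a 640 x 480 image split into a 4 x 6 grid gives 24 blocks.
blocks = image_blocks(640, 480, rows=4, cols=6)
print(len(blocks))
```

Integer division keeps the boxes contiguous and non-overlapping even when the image size is not a multiple of the grid size.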
![Page 19: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/19.jpg)
Annotation in Digital Photos (cont.)
The goal is to predict the words W associated with an untagged image based on its blocks B.
B is the observed variable.
The model estimates the conditional probability of a word given a set of blocks.
![Page 20: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/20.jpg)
Annotation in Digital Photos (cont.)
During the clustering process, we learn the optical clusters from untagged images.
Whenever a new image X arrives, we assign it to the cluster O_j with the maximum value of P(X|O_j).
The model then estimates the probability of a word given the pixel feature blocks and the optical context information.
![Page 21: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/21.jpg)
Results
Experimental dataset: Flickr, split into train, evaluation and test sets.
Performance evaluation: precision and recall, based on three counts.
The number of photos correctly annotated with a tag.
The number of photos annotated with that tag in the real data.
The number of photos predicted to have that tag.
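The three counts combine into per-tag precision and recall as sketched below: precision is the correct count divided by the predicted count, and recall is the correct count divided by the count in the real data. The dictionary layout (photo id mapped to a set of tags) and the function name are assumptions for illustration.

```python
def precision_recall(predicted, actual, tag):
    """Per-tag precision and recall.

    predicted, actual: dict mapping photo id -> set of tags.
    """
    predicted_ids = {pid for pid, tags in predicted.items() if tag in tags}
    actual_ids = {pid for pid, tags in actual.items() if tag in tags}
    correct = len(predicted_ids & actual_ids)
    precision = correct / len(predicted_ids) if predicted_ids else 0.0
    recall = correct / len(actual_ids) if actual_ids else 0.0
    return precision, recall


# Made-up example with four photos.
actual = {1: {"wildlife"}, 2: {"wildlife"}, 3: {"city"}, 4: {"wildlife"}}
predicted = {1: {"wildlife"}, 2: {"city"}, 3: {"wildlife"}, 4: {"wildlife"}}
print(precision_recall(predicted, actual, "wildlife"))
```

In this made-up example three photos are predicted as wildlife, three really are wildlife, and two are in both sets, so precision and recall are both 2/3.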
![Page 22: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/22.jpg)
Results (cont.)
Predicted tag: wildlife
Optical Context: 0.71
Image Features (CRM): 0.16
Thumbnail-Context: 0.44
![Page 23: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/23.jpg)
Using Ontology to Improve Tagging
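CIDE word similarity ontology.
Wu-Palmer similarity between two tags x and y, where d(.) is the depth of a node in the ontology and p is the deepest common ancestor of x and y:
$$Sim(x, y) = \frac{2 \cdot d(p)}{d(x) + d(y)}$$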
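A minimal sketch of the Wu-Palmer measure, Sim(x, y) = 2 d(p) / (d(x) + d(y)), where d is the depth of a node and p is the deepest common ancestor of x and y. The tiny taxonomy below is made up for illustration and is not the CIDE ontology; depths are counted with the root at depth 1, which is an assumption.

```python
# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}


def path_to_root(node):
    """Return the node and its ancestors, node first and root last."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    return len(path_to_root(node))  # the root has depth 1


def wu_palmer(x, y):
    ancestors_y = set(path_to_root(y))
    # The first ancestor of x that is also an ancestor of y is the deepest common one.
    p = next(a for a in path_to_root(x) if a in ancestors_y)
    return 2 * depth(p) / (depth(x) + depth(y))


print(wu_palmer("dog", "cat"), wu_palmer("dog", "tree"))
```

Tags that share a deep ancestor (dog and cat under animal) score higher than tags that only meet at the root (dog and tree).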
![Page 24: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/24.jpg)
Using Ontology to Improve Tagging
Shrink this estimate using semantic similarity:
$$P(W \mid I) = \lambda \, P_{MLE}(W \mid I) + (1 - \lambda) \, Sim(W, \cdot)$$
![Page 25: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/25.jpg)
Results (cont.)
![Page 26: Semantics In Digital Photos A Contenxtual Analysis](https://reader035.fdocuments.in/reader035/viewer/2022062419/55770b1fd8b42a0e058b5409/html5/thumbnails/26.jpg)
Conclusion
Optical context data is only a small fraction of a photo, yet it carries invaluable information about the shooting environment.
Fusing ontological models on semantics about photos also improves precision.
Future work: fuse other types of context with the content and optical context features.