Improving Personal Tagging Consistency through
Visualization of Tag Relevancy
Dr. Qin Gao*, Yusen Dai, and Kai Fu
Institute of Human Factors & Ergonomics
Dept. of Industrial Engineering, Tsinghua University
HCI International 2009
19-24 July 2009, San Diego, CA, USA
Content
Introduction
Research Question
Methodology
Results & Discussion
Conclusion
Tagging consistency is important for users to organize things effectively and to retrieve them efficiently later on.
Dr. Qin Gao, Institute of Human Factors & Ergonomics, Dept. of Industrial Engineering, Tsinghua University
Introduction
Tagging has emerged as a new means of information organization and retrieval
Tagging is easy to use, flexible, and able to harvest the intelligence of the crowd.
But there are many inconsistencies in tagging systems!
[Figure: Tripartite model of a tagging system, linking Content 1-4 with Tag 1-4, from Halpin, Robu, & Shepherd, 2007]
Introduction
Vocabulary problems
“Bad” tags: misspelt tags, badly encoded tags, mixed use of singulars and plurals, etc.
Inevitable semantic inconsistency: polysemy, synonymy, and basic-level variation (Golder & Huberman 2006).
Consistency between taggers
The extent to which different users agree on the selection of certain tags for specific content.
Allowing a true representation of knowledge and multiple interpretations of the same content.
Trends towards stabilization (Golder & Huberman, 2005)
Consistency within individual taggers
The extent to which an individual user agrees with himself or herself on the selection of certain tags for specific content at different points in time.
Introduction
Consistency within individual taggers is important to individual users and to the system.
Affecting efficiency of information organization and retrieval tasks for individual users
Organizing information is one of the main motivations for tagging (Ames and Naaman, 2007; Marlow, et al., 2006).
Indexing research shows that reliance on consistently used indexing cues is desirable for effective access to information.
Impacts on users’ perceived usefulness of the system and their satisfaction.
How to improve individual tagging consistency?
Providing tag suggestions based on existing tagging patterns can shape users’ tagging behavior (Sen et al., 2006; Binkowski, 2006)
How to present such suggestions?
How to select tags for suggestion?
Visualization of Tags
The first generation of tag clouds
Tag popularity is represented by visual cues
The second generation of tag clouds
Semantic relations among tags are revealed by visualization
Semantic clustering of tags by Montero & Solana (2006)
Tag clouds from Amazon, from Bateman 2007
Nielson, 2007
Research Question
Goal of the study: to examine the effect of tag frequency visualization and semantic clustering on users’ tagging consistency
Hypothesis 1: visualization of occurrence frequency of tags improves personal tag consistency and reduces users’ workload.
Hypothesis 2: visualization of inter-tag relevancy improves personal tag consistency.
Methodology
2 × 2 experimental design: frequency visualization (without/with) × semantic clustering visualization (without/with)
Methodology
Frequency visualization by font size

Definition of font size levels

| Font size level | Font size (px) |
|---|---|
| 1 | 12 |
| 2 | 20 |
| 3 | 28 |
| 4 | 36 |
| 5 | 44 |
| 6 | 52 |
| 7 | 60 |

The font size was determined by the following logarithmic function:

$$\mathrm{Current}_i = \frac{6 \log(O_i)}{\log(120)} + 1$$

$\mathrm{Current}_i$ is the font size level of the current tag.
$O_i$ is the use frequency of the current tag.

[Figure: The relationship between font size level and tag frequency]
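As a rough illustration of the mapping from tag frequency to font size, the sketch below assumes the level is computed as 6·log(O_i)/log(120) + 1 and then rounded down to a whole level, with frequencies above 120 clamped to level 7. The rounding and clamping rules, and the function names `font_size_level` and `font_size_px`, are assumptions made for this example and are not stated on the slide.

```python
import math

# Font size in pixels for each of the seven levels (values from the slide's table).
FONT_SIZE_PX = {1: 12, 2: 20, 3: 28, 4: 36, 5: 44, 6: 52, 7: 60}


def font_size_level(frequency, max_frequency=120):
    """Map a tag's use frequency O_i to a font size level between 1 and 7."""
    if frequency < 1:
        raise ValueError("frequency must be at least 1")
    # Compute the ratio first so that frequency == max_frequency gives exactly 1.0.
    ratio = math.log(frequency) / math.log(max_frequency)
    level = math.floor(6 * ratio + 1)  # assumption: round down to a whole level
    return max(1, min(level, 7))       # assumption: clamp frequencies above 120


def font_size_px(frequency):
    """Look up the pixel size for a tag with the given use frequency."""
    return FONT_SIZE_PX[font_size_level(frequency)]


for freq in (1, 11, 120):
    print(freq, font_size_level(freq), font_size_px(freq))
```

Under these assumptions a tag used once is drawn at 12 px and a tag used 120 times at 60 px; because the scale is logarithmic, a tag used only 11 times already reaches level 4.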
Methodology
Visualization of tag relevancy – Semantic clustering
Clusters of relevant tags were calculated from co-occurrence similarity, using the K-means clustering approach described by Montero and Solana (2006).
The approach was shown to reduce the semantic density of tag clouds significantly.
Definition of the vector space:

$$t_i = (d_{1i}, d_{2i}, \ldots, d_{ni})$$

$$\operatorname{cosine}(t_1, t_2) = \frac{t_1 \cdot t_2}{\lVert t_1 \rVert \, \lVert t_2 \rVert}$$
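The cosine similarity between two tag vectors can be sketched as follows. The example assumes that component j of a tag's vector is 1 when the tag was assigned to document j and 0 otherwise; the slide does not define the components, so this encoding, the helper `tag_vectors`, and the sample documents are all hypothetical. The clustering step itself is not shown.

```python
import math


def tag_vectors(documents):
    """Build one vector per tag; component j is 1 if the tag is on document j."""
    tags = sorted({tag for doc in documents for tag in doc})
    return {tag: [1 if tag in doc else 0 for doc in documents] for tag in tags}


def cosine(t1, t2):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(t1, t2))
    norm1 = math.sqrt(sum(a * a for a in t1))
    norm2 = math.sqrt(sum(b * b for b in t2))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


# Hypothetical tagged documents, for illustration only.
docs = [{"tree", "forest"}, {"tree", "forest", "lake"}, {"city", "street"}, {"city"}]
vectors = tag_vectors(docs)
print(cosine(vectors["tree"], vectors["forest"]))  # tags that always co-occur
print(cosine(vectors["tree"], vectors["city"]))    # tags that never co-occur
```

Tags that always appear on the same documents score close to 1 and tags that never co-occur score 0, which is the similarity a clustering step would then group on.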
Methodology
Dependent variables
Tagging consistency
Let $A_i$ and $B_i$ denote the sets of tags assigned to the same document in two different tagging sessions; then the tagging consistency for this document is:

$$O(A_i, B_i) = \frac{|A_i \cap B_i|}{|A_i \cup B_i|}$$

The overall tagging consistency over the $n$ documents:

$$O = \frac{1}{n} \sum_{i=1}^{n} O(A_i, B_i)$$

Workload measured by NASA-TLX
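The tagging consistency measure can be illustrated with a short sketch. It assumes the per-document score is the number of tags shared by the two sessions divided by the number of distinct tags used in either session, and that the overall score is the mean over documents. The function names, the handling of two empty tag sets, and the sample data are assumptions made for this example.

```python
def document_consistency(tags_a, tags_b):
    """Overlap of two tag sets for one document: |A ∩ B| / |A ∪ B|."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    if not union:
        return 1.0  # assumption: two empty tag sets count as fully consistent
    return len(a & b) / len(union)


def overall_consistency(session_1, session_2):
    """Mean per-document consistency over documents tagged in both sessions."""
    if len(session_1) != len(session_2) or not session_1:
        raise ValueError("both sessions must cover the same non-empty document list")
    scores = [document_consistency(a, b) for a, b in zip(session_1, session_2)]
    return sum(scores) / len(scores)


# Hypothetical tags for two documents in two sessions, for illustration only.
first = [{"tree", "lake"}, {"city", "night"}]
second = [{"tree", "sky"}, {"city", "night"}]
print(overall_consistency(first, second))
```

In this example the first document shares one of three distinct tags and the second shares both, so the overall consistency is the mean of 1/3 and 1.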
Methodology
Stimuli
100 pictures selected from Flickr, tagged as “nature”, “city”, or “people”
20 were stimuli, and the other 80 were filler pictures
Participants
40 participants, including 10 females and 30 males, aged from 20 to 31
All were experienced tagging users
Procedure
Two tagging sessions, with a disruptive interval in between.
Results
Testing of Hypothesis 1

| Measure | No frequency visualization (N=20): M | SD | With frequency visualization (N=20): M | SD | F(1,36) | p |
|---|---|---|---|---|---|---|
| No. of tags in 1st session | 47.3 | 15.35 | 46.3 | 19.48 | χ2 = 0.12 (a) | .73 |
| No. of tags in 2nd session | 46.4 | 13.20 | 46.7 | 18.39 | <0.01 | .95 |
| Consistency | 0.72 | 0.116 | 0.69 | 0.145 | 0.78 | .38 |
| Workload | | | | | | |
| Mental demand | 42.0 | 16.98 | 53.0 | 15.55 | χ2 = 4.09 (a) | .04* |
| Physical demand | 33.9 | 18.97 | 22.4 | 15.84 | χ2 = 4.08 (a) | .04* |
| Temporal demand | 32.3 | 17.86 | 44.8 | 20.10 | χ2 = 2.93 (a) | .08 |
| Performance | 40.4 | 19.85 | 35.87 | 21.89 | 0.49 | .49 |
| Effort | 59.4 | 19.06 | 62.0 | 21.28 | χ2 = 0.37 (a) | .54 |
| Frustration level | 22.3 | 22.45 | 32.3 | 24.03 | χ2 = 2.14 (a) | .14 |
| Global | 43.0 | 12.73 | 46.0 | 12.35 | 0.52 | .47 |

(a) Kruskal-Wallis test. * Significant difference at p < .05.
Results
Frequency visualization has no significant impact on tagging consistency.
Frequency visualization reduces perceived physical demand significantly, but also increases mental demand.
An interaction effect on physical demand (χ2 = 6.4, p = .01)
[Figure: Physical demand (axis 0 to 45) by frequency visualization (No / Yes), with one series for no semantic clustering visualization and one for with semantic clustering visualization; data labels: 27.8, 27.8, 40.0, 11.7]
Results
Testing of Hypothesis 2

| Measure | No clustering visualization (N=20): M | SD | With clustering visualization (N=20): M | SD | F(1,36) | p |
|---|---|---|---|---|---|---|
| No. of tags in 1st session | 49.4 | 18.65 | 44.2 | 15.92 | χ2 = 0.84 (a) | .36 |
| No. of tags in 2nd session | 48.6 | 17.80 | 44.6 | 13.70 | 0.62 | .43 |
| Consistency | 0.67 | 0.126 | 0.75 | 0.127 | 4.0 | .05* |
| Workload | | | | | | |
| Mental demand | 51.1 | 16.47 | 43.8 | 17.17 | χ2 = 2.39 (a) | .12 |
| Physical demand | 27.8 | 18.51 | 28.5 | 18.37 | χ2 = 0.04 (a) | .82 |
| Temporal demand | 41.9 | 20.69 | 35.2 | 18.78 | χ2 = 1.27 (a) | .26 |
| Performance | 32.8 | 18.28 | 43.5 | 22.12 | 2.81 | .10 |
| Effort | 63.3 | 17.32 | 58.2 | 22.50 | χ2 = 0.24 (a) | .62 |
| Frustration level | 26.8 | 20.66 | 27.8 | 26.58 | χ2 = 0.10 (a) | .74 |
| Global | 45.2 | 10.62 | 45.9 | 14.33 | 0.09 | .76 |

(a) Kruskal-Wallis test. * Significant difference at p < .05.
Results
Semantic clustering improves personal tagging consistency significantly. H2 was supported.
But there was no significant difference in workload or in the number of tags given by participants.
The consistency level of participants tagging with semantic clustering (0.75) is 12% higher than that of participants tagging without such visualization (0.67).
Discussion
Two types of tags
General categorical tags, influenced by the basic level
High recall but low accuracy
Users have a strong bias to use them as first tags (Golder & Huberman, 2005).
Relatively more consistent.
Descriptive/specific tags, ego-centered
High accuracy but low recall
Major source of inconsistencies
All participants expressed their intention to tag consistently, but often failed to do so due to limited memory.
Discussion
Semantic clustering of tags supports users’ tag formulation tasks and improves their consistency in identifying and deciding on specific tags.
It improves the performance of specific search and increases attention towards tags in small fonts compared to other layouts (Schrammel et al., 2009).
Frequency visualization does not provide support for search of specific tags.
When used in combination with semantic clustering, it helps reduce perceived physical demand.
Conclusion
Visualizing the relevancy among tags has a significant positive effect on tagging consistency, whereas visualizing tagging frequency does not.
Empirical support for the effort of visualizing semantic relationships among tags
When tag relevancy is visualized, highlighting frequently used tags can reduce perceived physical demand; however, it increases perceived mental demand as well.
Implications for professional indexer aid design.