SPIE-Pres
-
NEURAL NETWORK BASED
VISUALIZATION OF COLLABORATIONS
IN A CITIZEN SCIENCE PROJECT
Alessandra M. M. Morais, Rafael D. C. Santos,
M. Jordan Raddick
-
Motivation: Citizen Science
-
Citizen Science
Use the power of volunteers to gather or process data.
Using idle computer time.
Collecting data.
Using human intelligence.
Not a new concept, but the web made several
interesting projects possible.
-
Citizen Science: Galaxy Zoo
Volunteers classify images of galaxies.
www.galaxyzoo.org
Part of the Zooniverse www.zooniverse.org
-
Citizen Science: Galaxy Zoo
150,000 volunteers.
More than 80,000,000 classifications.
60% of the volunteers classified
-
Citizen Science
One important issue: data quality.
More collaborators → more data → better quality?
Better collaborators → better quality?
How to identify different types of collaborators:
Non-intrusively.
Without positive or negative reinforcement.
Log analysis.
How to identify and motivate certain categories of
users?
-
Previous Results
Morais, A. M. M.; Raddick, J.; Santos, R. D. C. Visualization and characterization of users in a citizen science project. SPIE Defense, Security, and Sensing, 2013.
-
The Self-Organizing Map and
Visualization
-
Kohonen's SOM
Neural network for unsupervised learning.
Projection of multidimensional data onto a lower-dimensional lattice.
Quantization: each neuron can be associated with several data vectors.
Projection: data vectors close in the original multidimensional space will be close in the lattice.
-
The Basic Algorithm
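The algorithm figure from this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of the standard online SOM training loop, not necessarily the exact variant the authors used. The function names, the linear decay schedules, the Gaussian neighbourhood and the default parameter values are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the lattice position whose prototype is closest to x."""
    return min(
        weights,
        key=lambda pos: sum((w - v) ** 2 for w, v in zip(weights[pos], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on data (a list of equal-length vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed initial neighbourhood radius
    # One prototype vector per neuron, initialised at random in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # radius shrinks, floor of 0.5
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood measured on the lattice, not in data space.
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

Because the winner's lattice neighbours are also pulled toward each sample, nearby neurons end up with similar prototypes, which is what gives the projection property described on the previous slide.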
-
SOM and Visualization
We can use the lattice to visualize a large amount of
multidimensional data.
Must choose a proper representation for the neurons.
Must take advantage of quantization and projection.
-
Icons, Features and Results
-
Icons
Parallel Coordinates will be used to visualize the users.
Simple, uncluttered icons with few dimensions (few attributes).
Each icon represents a prototype vector and the set of data
vectors assignable to that prototype vector.
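To make the idea of "a prototype vector plus its assigned data vectors" concrete, the following sketch groups data vectors by nearest prototype and computes, per neuron, the values an icon could draw on its parallel axes. The slides do not describe the icon layout in detail, so the summary fields (`count`, `low`, `high`) and the function names are hypothetical.

```python
def nearest_prototype(prototypes, x):
    """Lattice position of the prototype closest to x (squared Euclidean distance)."""
    return min(
        prototypes,
        key=lambda pos: sum((p - v) ** 2 for p, v in zip(prototypes[pos], x)),
    )


def icon_summaries(prototypes, data):
    """For each neuron, summarise the data vectors assigned to its prototype."""
    members = {pos: [] for pos in prototypes}
    for x in data:
        members[nearest_prototype(prototypes, x)].append(x)

    icons = {}
    for pos, proto in prototypes.items():
        vecs = members[pos]
        icons[pos] = {
            "prototype": list(proto),  # polyline drawn across the parallel axes
            "count": len(vecs),        # how many data vectors this icon stands for
            # Per-attribute envelope of the assigned vectors (None if the neuron is empty).
            "low": [min(col) for col in zip(*vecs)] if vecs else None,
            "high": [max(col) for col in zip(*vecs)] if vecs else None,
        }
    return icons
```

Each entry holds what one small parallel-coordinates icon needs: a line for the prototype and, optionally, a band showing the spread of the users it represents.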
-
Features
Main features:
Participation range (p): number of days between first and last recorded interaction.
Participation count (d): number of days of activity.
Maximum classifications in a day (max).
Total classifications (total).
Average number of classifications per user (average).
Only the first 600 days of participation were considered.
-
Features
Features:
a1: p/600 (close to 1: long term)
a2: d/p (close to 1: frequent during participation)
a3: d/600 (close to 1: frequent during project)
a4: max/total (close to 1: all in a day)
a5: total/average (close to 1: close to average user classif.)
a6: d (visual complement)
a7: log10(total) (how many classifications)
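The seven attributes can be computed from a per-user table of daily classification counts. The sketch below is one possible reading of the definitions above, not the authors' code. It assumes that days are indexed from 0 within the 600-day window, that the participation range is counted inclusively (so a single active day gives p = 1 and d/p never exceeds 1), and that `average_total` is supplied by the caller. The function and parameter names are hypothetical.

```python
import math

WINDOW_DAYS = 600  # only the first 600 days are considered


def user_features(daily_counts, average_total, window=WINDOW_DAYS):
    """Compute a1..a7 for one user.

    daily_counts: dict mapping day index -> number of classifications that day.
    average_total: average number of classifications per user over all users.
    """
    days = sorted(
        day for day, n in daily_counts.items() if n > 0 and 0 <= day < window
    )
    if not days:
        raise ValueError("user has no activity inside the window")

    p = days[-1] - days[0] + 1  # assumption: inclusive span of days
    d = len(days)               # number of active days
    counts = [daily_counts[day] for day in days]
    total = sum(counts)
    busiest = max(counts)

    return {
        "a1": p / window,               # close to 1: long term
        "a2": d / p,                    # close to 1: frequent during participation
        "a3": d / window,               # close to 1: frequent during project
        "a4": busiest / total,          # close to 1: all in a day
        "a5": total / average_total,    # close to 1: close to the average user
        "a6": d,                        # raw day count; any scaling is not given in the slides
        "a7": math.log10(total),        # order of magnitude of the contribution
    }
```

For example, a user active on days 0, 2 and 299 with 10, 30 and 60 classifications, in a project where the average user made 200, gets a1 = 0.5, a4 = 0.6 and a7 = 2.0.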
-
Visualizing Volunteers' Activity
Activities: general view, seven attributes
-
Visualizing Volunteers' Activity
Activities: general view, seven attributes
Curious: very short activity interval, very active in this interval, did not contribute much.
-
Visualizing Volunteers' Activity
Activities: general view, seven attributes
Potentials: contributed sporadically but significantly.
-
Visualizing Volunteers' Activity
Activities: general view, seven attributes
Dedicated: contributed frequently, contributed a lot.
-
Visualizing Volunteers' Activity
25% or less of correct classifications
-
Visualizing Volunteers' Activity
75% or more of correct classifications
-
Sessions and Accuracy
Another visualization example:
a1: number of sessions
a2: average session length in seconds
a3: average number of classifications per session
a4: percentage of correct classifications
A session is delimited by periods of inactivity (180 seconds).
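The session attributes can be derived from a timestamped classification log. The following is a minimal sketch under stated assumptions: a gap of more than 180 seconds between consecutive classifications starts a new session, session length is the time between the first and last classification in the session, and each log entry already carries a correctness flag (the slides do not say how correctness was determined). The function names and the event format are hypothetical.

```python
SESSION_GAP_SECONDS = 180  # inactivity threshold from the slide


def split_sessions(timestamps, gap=SESSION_GAP_SECONDS):
    """Group timestamps (in seconds) into sessions separated by more than `gap`."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= gap:
            sessions[-1].append(t)  # still within the current session
        else:
            sessions.append([t])    # inactivity gap exceeded: start a new session
    return sessions


def session_features(events, gap=SESSION_GAP_SECONDS):
    """Compute a1..a4 from a list of (timestamp_seconds, is_correct) pairs."""
    if not events:
        raise ValueError("no classifications recorded for this user")
    sessions = split_sessions([t for t, _ in events], gap)
    n_sessions = len(sessions)
    n_correct = sum(1 for _, ok in events if ok)
    return {
        "a1": n_sessions,
        "a2": sum(s[-1] - s[0] for s in sessions) / n_sessions,  # mean length, seconds
        "a3": len(events) / n_sessions,                          # mean classifications per session
        "a4": 100.0 * n_correct / len(events),                   # percentage correct
    }
```

With this definition a session containing a single classification has length 0, which pulls a2 down for users who classify once and leave. A different length convention would change a2 but not the other attributes.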
-
Visualizing Volunteers' Accuracy
Session data and correct classifications
-
Conclusions
-
Conclusions
Visualization can give insight into data, but:
Many methods, many parameters.
Very hard to find an "Aha!" solution.
Guided visualization for exploratory analysis is very useful.
Kohonen's Self-Organizing Map is able to do visual, almost-fuzzy clustering of multidimensional data.