A Picture Is Worth A Thousand Questions Docx


Transcript of A Picture Is Worth A Thousand Questions Docx

Page 1: A Picture Is Worth A Thousand Questions Docx

A picture worth a thousand questions

Visualization techniques for social science discovery in computational spaces

Howard T. Welser, Thomas M. Lento, Marc A. Smith, Eric Gleave, and Itai Himelboim

Page 2: A Picture Is Worth A Thousand Questions Docx

Introduction

Social life increasingly takes place through computer-mediated interaction systems, and these systems are growing in the diversity of their affordances for action (Gaver, 1991).

The 'Community Views' tool kit integrates multiple information visualizations and populates them with data produced by a stream of multiple years of Usenet message traffic (Smith & Fiore, 2001).

Page 3: A Picture Is Worth A Thousand Questions Docx

ATTRIBUTES & ANALYSIS OF COMPUTER MEDIATED SOCIAL WORLDS

Attributes of Online Social Systems

1. Interface

The interface is the tool that connects the human and the computer. Interfaces for accessing online spaces increasingly include a wide range of media (text, images, and sound), and nowadays they offer many categories of content in detail. Most online communities can be reached through the World Wide Web, either as public pages like Wikipedia or as member-only pages like fatsecret. Furthermore, access can be restricted by commercial software, such as World of Warcraft or Second Life, and by hardware requirements (for example, the commercial program ION requires at least a Pentium 3).

Page 4: A Picture Is Worth A Thousand Questions Docx

2. Boundaries

In Usenet, newsgroups act as collections around sets of threaded discussions. Usenet is composed of newsgroups focused on a range of topics and interests and interconnected with other newsgroups through shared messages. The Slashdot web community provides a fixed set of general topical classifications. Wikipedia provides a range of page types with specialized functions: articles, discussion pages, community pages, and infrastructure pages are clearly distinguished. Wikipedia participants are themselves clustered into classes based on common behaviors and structural connections.

Page 5: A Picture Is Worth A Thousand Questions Docx

3. Thread. A thread is a set of messages built by attaching replies after an original message. In other words, a thread is a collection of digital objects that refer to one another in a hierarchy. A good example to keep in mind is the Cyworld Karma social rating system.

4. Network of relationships (Il-chon or E-chon). A social network is the pattern of relationships in a population (see Nadel, 1964; Freeman, 2000; Scott, 2000).

Page 6: A Picture Is Worth A Thousand Questions Docx

5. History of participation (view counts, scrap counts). All users of online spaces develop reputational signals from their history of participation.

6. Representation of identity (expressing ‘self’). These range from distinctive names, signature files, avatars, and personal pages to tags, biographical statements, affiliations, journals, images, and other files uploaded or linked to the personal pages.

Page 7: A Picture Is Worth A Thousand Questions Docx

7. Dimensions of data contributed (uploading and sharing data)

YouTube supports video, and Flickr is dedicated to sharing and tagging images. MySpace, Facebook, and Wikipedia let users upload and share files that are more personal.

Page 8: A Picture Is Worth A Thousand Questions Docx

Analysis: general considerations

A systematic study of computer-mediated social interaction spaces must consider the dimensions of behavior they contain. Typically, behavioral variables are measured as counts, like the number of comments or relationships. Correspondingly, wherever appropriate, these measures should be considered in terms of rates of activity, or be standardized or displayed on a log scale. Another consideration is that the boundaries of social action may or may not correspond to boundaries within the online space and may often spread across multiple spaces (Baym, 2007).
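The three transformations mentioned above (rates, standardization, log scale) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names, the per-day rate, and the sample counts are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def activity_rate(count, days_active):
    """Events per day of observed activity (hypothetical unit of time)."""
    if days_active <= 0:
        raise ValueError("days_active must be positive")
    return count / days_active


def standardize(values):
    """Return z-scores based on the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: every z-score is zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def log_scale(count):
    """log10(1 + count), so that a count of zero maps to zero."""
    return math.log10(1 + count)


# Hypothetical per-author comment counts with a heavy right tail.
comment_counts = [0, 9, 99, 999]
logged = [log_scale(c) for c in comment_counts]
z_scores = standardize(comment_counts)
```

Adding 1 before taking the logarithm is one common way to keep authors with zero contributions on the plot; other offsets or transformations may suit a given data set better.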

Page 9: A Picture Is Worth A Thousand Questions Docx

Systematic differences in how participants contribute can be conceptualized as social roles (Welser, Smith, Gleave, & Fisher, 2007), and mapping the distribution of these social roles across community boundaries will suggest the appropriate theoretical framework for modeling the social action that occurs within them (Monge & Contractor, 2003).

Page 10: A Picture Is Worth A Thousand Questions Docx

Two measurement challenges

1) What temporal and behavioural bounds should we place on the definition of a tie in a given study?

2) More conceptually, which modes of interaction represent meaningful social relationships within the focal population?
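To make the first challenge concrete, the sketch below infers directed ties from reply records under an explicit time window (the temporal bound) and a minimum number of replies (the behavioural bound). The record format, the function name `infer_ties`, and the threshold values are hypothetical choices for illustration; a real study would have to justify its own bounds.

```python
from collections import Counter
from datetime import date


def infer_ties(replies, start, end, min_replies=2):
    """Return the set of directed (replier, target) ties.

    replies: iterable of (replier, target, day) tuples.
    A tie exists when the replier answered the target at least
    `min_replies` times between `start` and `end`, inclusive.
    """
    counts = Counter(
        (src, dst)
        for src, dst, day in replies
        if start <= day <= end and src != dst  # ignore self-replies
    )
    return {pair for pair, n in counts.items() if n >= min_replies}


# Hypothetical reply log.
replies = [
    ("ann", "bob", date(2006, 1, 3)),
    ("ann", "bob", date(2006, 1, 20)),
    ("bob", "ann", date(2006, 1, 5)),
    ("ann", "bob", date(2006, 3, 1)),
    ("cat", "ann", date(2006, 1, 7)),
    ("cat", "ann", date(2006, 1, 8)),
]

january_ties = infer_ties(replies, date(2006, 1, 1), date(2006, 1, 31))
```

Changing either bound changes the resulting network: with `min_replies=1` the single reply from bob to ann also counts as a tie, and a shorter window removes the tie from ann to bob.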

Page 11: A Picture Is Worth A Thousand Questions Docx

Agenda for visualization in social scientific discovery

1. Purposes of applying information visualization to social media

Visualization is a key step in a larger process of identifying meaningful dimensions of interaction, aggregating actions, and visualizing distributions and relationships in the population. Given the number and fine-grained detail of logged events, aggregating and describing them is a critical methodological and theoretical task (see Welser, Smith, Gleave, & Fisher, 2008).

Page 12: A Picture Is Worth A Thousand Questions Docx

2. Challenges of applying information visualization to social media

Making use of color, shape, size, and orientation to map different data dimensions expands the density of data captured in a single class of image. Although such approaches can present a lot of information in a single visualization, images can eventually become too complex or violate rules of visual perception, obscuring information rather than revealing its real character (Tufte, 1995, 1997).

Page 13: A Picture Is Worth A Thousand Questions Docx

3. Solutions and approaches to effective information visualization (examples are Many Eyes and the treemap, which is similar to a mind map)

Information visualization is a topic of deep complexity (Tufte, 1995; Donath, 1999; Freeman, 2000). A recent critique of network visualizations noted that basic tasks like following the links between any two nodes are often impossible (Shneiderman & Aris, 2006). Promising recent work suggests new approaches to network visualization that combine network data with other nodal data to create more informative images (Brandes, Raab, & Wagner, 2001; Shneiderman & Aris, 2006). Semantic substrates (Shneiderman & Aris, 2006) are a method to project network visualizations into meaningful spatial containers. A semantic substrate could be a non-geographic map like a treemap that clusters nodes by attributes other than their connections to other nodes (Shneiderman, 2004).

Page 14: A Picture Is Worth A Thousand Questions Docx

VISUALIZATIONS FOR DISCOVERY

Mapping Boundaries and Hierarchies: Treemaps and Graphs

Figure 10.1 depicts a treemap (Shneiderman, 2004; Smith & Fiore, 2001) of posts to Usenet newsgroups under the microsoft.public hierarchy [microsoft.public.excel, … excel.programming, etc.] for 2001 (see Turner, Smith, Fisher, & Welser, 2005).
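For readers unfamiliar with treemaps, the following sketch shows how a hierarchy of post counts can be turned into nested rectangles whose areas are proportional to those counts. It uses the simple slice-and-dice layout, which is not necessarily the layout behind Figure 10.1, and the newsgroup names and post counts are made up for the example.

```python
def total(node):
    """Sum of leaf values under a node: (name, number) or (name, [children])."""
    _name, payload = node
    if isinstance(payload, list):
        return sum(total(child) for child in payload)
    return payload


def slice_and_dice(node, x, y, w, h, vertical=True):
    """Return a list of (leaf_name, x, y, width, height) rectangles.

    Each level splits its rectangle among its children in proportion to
    their totals, alternating the split direction at every level.
    """
    name, payload = node
    if not isinstance(payload, list):
        return [(name, x, y, w, h)]
    rects = []
    whole = total(node)
    offset = 0.0  # fraction of the rectangle already allocated
    for child in payload:
        share = total(child) / whole if whole else 0.0
        if vertical:
            rects += slice_and_dice(child, x + offset * w, y, share * w, h, False)
        else:
            rects += slice_and_dice(child, x, y + offset * h, w, share * h, True)
        offset += share
    return rects


# Hypothetical post counts for a small newsgroup hierarchy.
tree = ("microsoft.public", [
    ("excel", [("programming", 60), ("misc", 20)]),
    ("word", 20),
])

rects = slice_and_dice(tree, 0, 0, 100, 100)
```

Here the "excel" branch receives 80 percent of the canvas because it holds 80 of the 100 posts, and that strip is then divided between its two child groups.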

Page 15: A Picture Is Worth A Thousand Questions Docx

Treemaps like these can be applied to other content-oriented sites like Slashdot, Wikipedia, and most topic-specific web forums. Here we shift our focus to mapping nested invitation relationships in the social network and blogging site Wallop. Figure 10.2 displays invitation relationships both as a hierarchy and as a graph.

Page 16: A Picture Is Worth A Thousand Questions Docx

In our previous studies of Wallop, we described the rise of different language communities and the contrasts between them (Gu, Johns, Lento, & Smith, 2006; Lento et al., 2006). Figure 10.3 shows the diffusion of invitations across the English and Chinese language communities.

Page 17: A Picture Is Worth A Thousand Questions Docx

The hyperbolic network graph is an effective method for exploring a large network that highlights adjacent nodes while downplaying distant ones (Schaffer et al., 1996). This tool is useful for exploration and provides a helpful addition to classic network visualization software like Pajek (De Nooy, Mrvar, Batagelj & Granovetter, 2005). We found that, as with the patterns noted in invitation practices, comment interaction in Wallop had occasional language group crossings, but was generally marked by a preference to reply to others in the same language (Lento et al., 2006).

The insight that the invitation tree contains more deviations from homophily than one might expect is actually borne out in records of interaction structure (see Lento et al., 2006, figure 5).
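A pattern like this language homophily can be turned into a number once it has been spotted in a picture. The sketch below computes the share of replies that stay within one language group and compares it with a naive random-mixing baseline. Both measures, the user names, and the data are illustrative assumptions, not the metrics used in Lento et al. (2006).

```python
from collections import Counter


def same_group_share(edges, group_of):
    """Fraction of (source, target) edges that connect members of one group."""
    if not edges:
        return 0.0
    same = sum(1 for src, dst in edges if group_of[src] == group_of[dst])
    return same / len(edges)


def random_mixing_baseline(group_of):
    """Expected same-group share if partners were drawn at random
    in proportion to group size (sum of squared group proportions)."""
    sizes = Counter(group_of.values())
    n = sum(sizes.values())
    return sum((size / n) ** 2 for size in sizes.values())


# Hypothetical users, language groups, and reply edges.
group_of = {"a": "en", "b": "en", "c": "zh", "d": "zh"}
edges = [("a", "b"), ("b", "a"), ("c", "d"), ("a", "c")]

observed = same_group_share(edges, group_of)
expected = random_mixing_baseline(group_of)
```

An observed share well above the baseline would be consistent with homophily, while crossings such as the edge from "a" to "c" pull the observed share back toward the baseline.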

Page 18: A Picture Is Worth A Thousand Questions Docx

Comparing patterns of behavior across different boundaries

The scatter plots in Figure 10.4 show the relationship between overall levels of activity (total posts, x-axis) and the size of the community (number of repliers, y-axis) across several different types of newsgroups.

Page 19: A Picture Is Worth A Thousand Questions Docx

Figure 10.5 is a set of ‘crowd views’ generated from data from a range of Usenet newsgroups. The crowd view is a scatter plot with a few additional attributes mapped to the color and size of each glyph. Each crowd view displays a glyph for each author in a newsgroup or other collection of threaded message conversations.

Page 20: A Picture Is Worth A Thousand Questions Docx

MAPPING STRUCTURE OF RELATIONSHIPS WITHIN THREADS AND GROUPS

Computer-mediated social systems are rife with ways to infer social ties from interaction records.

The following network graphs were collected through content analysis of edits to a Wikipedia policy discussion page. These data come from the first archived page of the ‘No personal attacks’ policy (see Black, Welser, DeGroot, & Cosley, 2008).

Page 21: A Picture Is Worth A Thousand Questions Docx

Such network graphs are valuable at initial steps in exploratory network visualization, but should be augmented with other data, like roles or status (Brandes, Raab, & Wagner, 2001). Egocentric network graphs based on comment relationships among Wallop users are shown in Figure 10.7. These visualizations are consistent with a general theoretical supposition (McAdam & Paulson, 1993).

Page 22: A Picture Is Worth A Thousand Questions Docx

Characterizing types of actors from histories of contributions and relations

Figure 10.8 is a revealing triptych for discerning roles from threaded discussion, especially tailored to distinguish the role of expert (or ‘answer person’) from that of other common participants. The set includes an ‘authorline’, a longitudinal characterization of the amount of contributions to particular threads while distinguishing between those initiated by ego and those initiated by others (Viegas & Smith, 2004). This set of three visualizations allowed us to identify some of the key structural signatures of experts in online discussion spaces (Welser et al., 2007).
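The data behind an authorline can be approximated with a simple aggregation: for one author (ego), count posts per time period and separate threads ego started from threads started by others. The tuple format, the weekly periods, and the function name `authorline` are assumptions for this sketch, which does not reproduce the rendering described by Viegas & Smith (2004).

```python
from collections import defaultdict


def authorline(posts, ego):
    """Count ego's posts per week, split by who started the thread.

    posts: iterable of (author, thread_starter, week_index) tuples.
    Returns {week_index: {"initiated": n, "joined": m}}.
    """
    weeks = defaultdict(lambda: {"initiated": 0, "joined": 0})
    for author, starter, week in posts:
        if author != ego:
            continue  # only ego's own contributions are counted
        key = "initiated" if starter == ego else "joined"
        weeks[week][key] += 1
    return dict(weeks)


# Hypothetical post log: (author, thread starter, week).
posts = [
    ("eve", "eve", 1),
    ("eve", "dan", 1),
    ("eve", "dan", 1),
    ("dan", "dan", 1),
    ("eve", "fay", 2),
]

eve_line = authorline(posts, "eve")
```

Plotting the two counts on opposite sides of a time axis gives a rough longitudinal profile of how an author divides effort between starting threads and joining the threads of others.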

Page 23: A Picture Is Worth A Thousand Questions Docx

DISCUSSION

Assessing visualizations: building better pictures and better picture production systems

Tufte (1995, 1997) and many others have pointed out standards for high-quality ‘final product’ visualizations. Two of our studies (Lento et al., 2006; Welser et al., 2007) illustrate this three-stage process:

(1) Production of visualizations of relationships and behavior to gain insight into patterns and develop hypotheses

(2) Operationalization of visualization patterns as metrics and variables for use in a statistical model

(3) Communication of model results

Page 24: A Picture Is Worth A Thousand Questions Docx

Ultimately, the best assessment of exploratory visualizations and systems for producing those visualizations is the predictive power of those models and the theoretical significance of these findings.

Page 25: A Picture Is Worth A Thousand Questions Docx

Further challenges in extending visualization techniques to complex data

1) Attempting to squeeze more information into a single visualization becomes counterproductive. By grouping similar measures, eliminating irrelevant information, and recombining different relationships into aggregated measures, it is sometimes possible to represent complex data effectively in a simple visualization.

2) Another approach is to represent complex data through comparison of clearly related images, such as moving images, or by presenting multiple representations of different subsets of the data in a single visualization.

Page 26: A Picture Is Worth A Thousand Questions Docx

CONCLUSIONS

There is much room for progress. First, we recognize limitations that stem from the need for additional methods of inquiry, like ethnographic study, statistical testing, and experimental research, in order to understand social dynamics more fully. We also recognize a need for greater development of tools for the efficient creation of precisely tuned sets of visualizations. Finally, we hope to see visualization strategies extended across wider ranges of comparable situations.

Page 27: A Picture Is Worth A Thousand Questions Docx

Thanks!!