March 11 Lab

Post on 06-Dec-2014


Dissertation Ideas

Diffusion of [ ] on online social network

March – 11 – Lab Meeting

Lexical Diffusion

• A phoneme is modified in a subset of the lexicon, and spreads gradually to other lexical items

• The phonetic law does not affect all items at the same time: some are designed to develop quickly, others remain behind, some offer strong resistance and succeed in turning back any effort at transformation. --Gauchat (cited in Dauzat 1922)

• ……… words change their pronunciations by discrete, perceptual increments (i.e., phonetically abrupt) but severally at a time (i.e., lexically gradual) --Wang and Chen 1977:150.

Exemplar theory and lexical diffusion

• The assumption that people learn phonetic categories by remembering many labeled tokens of these categories explains . . . why leniting historical changes are typically more advanced for high-frequency words than low-frequency words. --Pierrehumbert (2000), Exemplar dynamics: Word frequency, lenition and contrast. To appear in J. Bybee and P. Hopper (eds.), Frequency effects and emergent grammar.

Network Study e.g. Castells, 2000, Watts, 2003

• When nodes are connected in networks, they can behave very differently from when they are apart, e.g. people rioting.

• By organizing nodes in networks, the system as a whole is very resilient and robust (Dijk, 2006), but it also makes outcomes of events very hard to predict or to mould. (Watts, 2003)

Labov: Principles of linguistic change: Social factors

• “leaders of linguistic change are people at the center of the social networks, who other people frequently refer to, with a wider range of social connections than others” (p.356)

• …linguistic change is similar to fashion. Linguistic fashion rests on the notion that such change is motivated by the need to be liked by others as well as the desire to be different

Social contagion: e.g. Watts (2003)

Social contagion of ideas occurs under very specific circumstances. Before an innovative idea is adopted widely, it has to percolate successfully into social clusters. Depending on the connectivity of the first percolating cluster, it may cascade through to parts of the network or even the whole network, in what is usually called a global cascade.
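The cascade described above can be illustrated with a small threshold model in Python. This is only a sketch under stated assumptions, not Watts's exact formulation: the toy network, the single shared threshold, and the function name `run_cascade` are all invented for illustration. A node adopts once the fraction of its neighbors who have adopted reaches the threshold.

```python
def run_cascade(neighbors, seeds, threshold):
    """Spread adoption until no further node reaches the threshold."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in adopted or not nbrs:
                continue
            # Fraction of this node's neighbors that have already adopted.
            frac = sum(1 for n in nbrs if n in adopted) / len(nbrs)
            if frac >= threshold:
                adopted.add(node)
                changed = True
    return adopted


# Hypothetical toy network: a tight cluster (a, b, c) linked to a chain d-e-f-g.
toy_network = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d", "f"],
    "f": ["e", "g"],
    "g": ["f"],
}

global_cascade = run_cascade(toy_network, seeds={"a", "b"}, threshold=0.5)
stalled_cascade = run_cascade(toy_network, seeds={"a", "b"}, threshold=0.6)
```

With the lower threshold the innovation leaves the seed cluster and reaches every node; with the slightly higher one it fills the cluster (a, b, c) and stops at its edge, which is the sensitivity to connectivity that the slide points to.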

Similar Attempts from Other Fields

• Information meme

• Diffusion of Innovation (technology adoption cycle)

• Viral marketing

Information Meme

• Memes are small units of culture/information, analogous to genes, which flow from person to person by copying or imitation.

• Meme-tracking and the Dynamics of the News Cycle

• Identification of topics over time

• The evolving practices of bloggers

• The cascading adoption of stories

• The ideological divisions in the blogosphere

Diffusion of Innovations by Everett M. Rogers (2003)

• What is Diffusion? It is the process by which an innovation is communicated through certain channels over time among the members of a social system.

Four Main Elements in the Diffusion of Innovations

o The Innovation: an idea, practice, or object perceived as new by an individual or other unit of adoption; not necessarily new

o Technological Innovations, Information and Uncertainty

• Technology: a design for an instrumental action that reduces the uncertainty in the cause/effect relationship involved in achieving a desired outcome

• Two Components

• Hardware: physical object

• Software: information

o Technology Clusters: one or more distinguishable elements of technology that are perceived as being closely interrelated

o Perceived Attributes of Innovations

• Relative Advantage: the degree to which an innovation is perceived as better than the idea it supersedes

• Compatibility: the degree to which an innovation is perceived as being consistent with existing values, past experiences, and needs of potential adopters

• Complexity: the degree to which an innovation is perceived as difficult to understand and use

• Trialability: the degree to which an innovation may be experimented with on a limited basis

• Observability: the degree to which the results of an innovation are visible to others

o Re-invention: the degree to which an innovation is changed or modified by the user in the process of adoption and implementation

Information diffusion in online social networks

• Information diffusion is influenced by network structure

• How does information diffusion shape networks?

Information Diffusion

Mark Granovetter (1973): The strength of weak ties

• “The fewer indirect contacts one has the more encapsulated he will be in terms of knowledge of the world beyond his own friendship circle” (Granovetter, 1973, p. 1371)

• Weak tie: occasional communication

• Distant node:

You are likely to already be familiar with the work and ideas of immediate colleagues and friends, but a colleague that you communicate with only occasionally is more likely to be a source of novel information.
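One rough way to spot candidate weak ties in network data is neighborhood overlap: the share of two people's other contacts that they have in common. The Python sketch below uses it as a stand-in proxy only; it is an assumption made here for illustration, not Granovetter's own definition of tie strength, and the friendship graph and names are hypothetical.

```python
def neighborhood_overlap(graph, u, v):
    """Share of u's and v's other contacts that they have in common (0.0 to 1.0)."""
    others_u = graph[u] - {v}
    others_v = graph[v] - {u}
    union = others_u | others_v
    if not union:
        return 0.0
    return len(others_u & others_v) / len(union)


# Hypothetical friendship graph: two tight circles joined by one tie (carol-dave).
friends = {
    "ann": {"bob", "carol"},
    "bob": {"ann", "carol"},
    "carol": {"ann", "bob", "dave"},
    "dave": {"carol", "eve", "fay"},
    "eve": {"dave", "fay"},
    "fay": {"dave", "eve"},
}

within_circle = neighborhood_overlap(friends, "ann", "bob")
bridge = neighborhood_overlap(friends, "carol", "dave")
```

In this toy graph the tie inside a circle has full overlap, while the carol-dave tie has none: it is the only route by which information from one circle can reach the other.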

Twitter: Information Diffusion

• Weak ties: the structure of Twitter’s open, content-centric network enables information diffusion via weak ties.

• Distant nodes: the platform has become a powerful tool for communicating scientific research, scholarship, and innovative ideas beyond one’s immediate peer group.

Twitter: information diffusion

• Different from Facebook or LinkedIn: not limited to one’s circle of friends.

• Friends could be followers, but followers are not exclusively friends

• Content-based rather than relationship-based; also topic-based.

• Retweeting, hashtags, and openness to people who are not Twitter users make for fast circulation

• Not entirely closed network: 44.5 million users (June, 2009)

Twitter: user intention

• Daily chatter: daily routine and updates

• Conversations: @username

• Sharing info/URLs

• Reporting news

Number of posts as a function of the number of followers: saturation

Twitter: some conventions

• @mentions - following word is the name of a twitter user and as such this tweet refers to that user, e.g. ”@dave thanks for the help” or ”Talking with @paul about twitter”.

• #hashtags - give contextual relevance to a tweet or identify a keyword, e.g. ”Like this demo #acita09” or ”Why does #ms-word keep crashing”

• Retweets - ”RT” means ”I am retweeting (copying) something from elsewhere”, e.g. ”RT@john I just saw Madonna” means that I am retweeting the original message from John
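The three conventions above can be pulled out of raw tweet text with regular expressions. The Python sketch below is a simplified illustration, not Twitter's own parsing rules: the patterns and the function name `parse_tweet` are assumptions, and the hashtag pattern allows hyphens only so that it matches the #ms-word example on this slide.

```python
import re

# Simplified, assumed patterns; real tweets have more edge cases (e.g. e-mail addresses).
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#([\w-]+)")
RETWEET_RE = re.compile(r"^RT\s*@(\w+)")


def parse_tweet(text):
    """Return the mentions, hashtags, and retweeted user (if any) found in a tweet."""
    retweet = RETWEET_RE.match(text)
    return {
        "mentions": MENTION_RE.findall(text),
        "hashtags": HASHTAG_RE.findall(text),
        "retweet_of": retweet.group(1) if retweet else None,
    }


examples = [
    "@dave thanks for the help",
    "Why does #ms-word keep crashing",
    "RT@john I just saw Madonna",
]
parsed = [parse_tweet(text) for text in examples]
```

Note that a retweet also counts as a mention of the original author here, since the "RT" prefix is followed by an @username.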

Twitter: functions of conventions

• @mentions are used when searching for tweets: a tweet directed to a certain user is known to be different to a general tweet and is processed differently in the API and in most Twitter clients

• #hashtags are most generally for rapid real-time identification of trending topics, such as #michaeljackson or #stimulus

• Retweets are used primarily by users to convey information from one social group to another; however, they can also be used in machine processing algorithms to determine a number of factors related to the topic of the tweet, the pace and coverage of transmission, the authority of the original tweeter, etc.
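As a rough illustration of the machine-processing point, the Python sketch below derives two crude measures from retweets: coverage (how many distinct users retweeted an author) and pace (seconds between the first and last retweet). The `(author, timestamp, text)` tuple layout, the sample data, and both measures are assumptions made for this sketch, not an established metric.

```python
import re
from collections import defaultdict

RT_RE = re.compile(r"^RT\s*@(\w+)")  # assumed pattern for the "RT @user" convention


def retweet_stats(tweets):
    """Per original author: distinct retweeters and the first-to-last retweet span."""
    retweeters = defaultdict(set)
    times = defaultdict(list)
    for author, timestamp, text in tweets:
        match = RT_RE.match(text)
        if match:
            source = match.group(1)
            retweeters[source].add(author)
            times[source].append(timestamp)
    return {
        source: {
            "coverage": len(users),
            "span_seconds": max(times[source]) - min(times[source]),
        }
        for source, users in retweeters.items()
    }


# Hypothetical sample: (author, seconds since first post, text).
sample = [
    ("john", 0, "I just saw Madonna"),
    ("amy", 60, "RT @john I just saw Madonna"),
    ("raj", 200, "RT@john I just saw Madonna"),
    ("amy", 300, "RT @john I just saw Madonna"),
]
stats = retweet_stats(sample)
```

Repeat retweets by the same user (amy here) extend the time span but do not add to coverage.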

Twitter: #hashtag-language convention

• #win - means ”a good experience” and may be used in the context of customer service, for example ”just got a free book from amazon #win”.

• #fb: cross-posting to Facebook

• #flight1549

Twitter: #hashtag – topic tracking

• Meta-tag: create topic

• http://hashtags.org/

• http://www.whatthetrend.com/

• http://twubs.com/: It aggregates tweets and imports pictures to help illuminate the topics being discussed.

• http://tagal.us/: a dictionary for #hashtags

Twitter: #hashtag - community

• Create a #hashtag

• Announce it in tweets: e.g. fellow linguists let’s share news about linguistics by adding #linguistics to our tweets

• Followers automatically get this announcement

Twitter: data acquisition

• API: Application Programming Interface

– a defined way for a program to accomplish a task, usually retrieving or modifying data

– Programmers use the Twitter API to make applications, websites, widgets, and other projects that interact with Twitter

Twitter: data acquisition

• API: detailed information on the users and the list of users each of them was following

• Constraint: the number of queries that could be issued in a day was the key limiting artifact in the reach of our crawl

• “public timeline” API: returns a list of the 20 most recent statuses posted to twitter.com by users with custom profile pictures and unrestricted privacy settings.

Proposed crawling

• The dataset ("crawl"), gathered by a constrained crawl of the Twitter network, was seeded by collecting the public timeline at four distinct times of day (2:00, 8:00, 14:00, and 20:00) and extracting the users that posted the statuses in these timelines. Each step in the crawl involved collecting details of the current user as well as a partial list of users being followed by the current user.
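The crawl procedure above can be sketched as a breadth-first traversal with a query budget. This Python sketch is an illustration under assumptions, not the actual crawler: `fetch_user` is a hypothetical stand-in for whatever API call returns a user's details and followee list, one query per user is assumed, and the in-memory `FAKE_NETWORK` replaces the real service.

```python
from collections import deque


def constrained_crawl(seed_users, fetch_user, max_queries, max_followees=100):
    """Breadth-first crawl that stops once the query budget is spent."""
    queue = deque(dict.fromkeys(seed_users))  # de-duplicate seeds, keep their order
    seen = set(queue)
    details = {}
    edges = []
    queries = 0
    while queue and queries < max_queries:
        user = queue.popleft()
        info, followees = fetch_user(user)  # assumed: one query per user
        queries += 1
        details[user] = info
        for followee in followees[:max_followees]:  # keep only a partial list
            edges.append((user, followee))
            if followee not in seen:
                seen.add(followee)
                queue.append(followee)
    return details, edges


# Hypothetical in-memory network standing in for the real API.
FAKE_NETWORK = {
    "u1": ["u2", "u3"],
    "u2": ["u3"],
    "u3": ["u1", "u4"],
    "u4": [],
}


def fake_fetch(user):
    """Return (details, followees) for a user, mimicking one API query."""
    return {"name": user}, FAKE_NETWORK.get(user, [])


details, edges = constrained_crawl(["u1"], fake_fetch, max_queries=3)
```

With a budget of three queries the crawl records the follow edge to u4 but never collects u4's own details, which shows how the daily query limit bounds the reach of the crawl.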

suggestions

• What questions to ask?

• How to organize the data?