STAKEHOLDER INTELLIGENCE ON SOCIAL MEDIA

by

Kasper Brødbæk Christensen

Advisor: Lars Haahr

Cand. IT

IT, Kommunikation og Organisation

Aarhus School of Business

01/08-2012

Attached to the thesis is a dataset, which can be downloaded through the following link: https://rapidshare.com/files/3479799533/Data.zip

Table of contents

0. Abstract

1. Introduction
1.1. Problem Statement

2. Problem Area
2.1. Discussion: Community or Influence?

3. More ideas lead to a better end-product: A Collective Intelligence perspective
3.1. Discussion: Recapitulation

4. Method and Discussion

5. Stakeholder Engagement as a Collective Intelligence system

6. Text Mining to extract information from social media
6.1. Text Mining Basics
6.2. Preparing data for Text Mining
6.3. Categorization of documents
6.4. Clustering of documents
6.5. Text Mining for stakeholder opinion

7. Enter Twitter, “Instantly connect to what’s most important to you.”
7.1. Twitter as a Collective Intelligence System

8. Stakeholder Intelligence on Twitter

9. The case of a Communications Manager at Novo Nordisk
9.1. CSR-Communication on Twitter
9.2. Establishing a business case (domain)
9.3. Selection of stakeholders (balance diversity and expertise)
9.4. Initial analysis of information quality

10. Evaluation of results and model

11. Conclusion

12. Bibliography
12.1. Articles (Order of Appearance)
12.2. Books (Order of Appearance)
12.3. Links (Order of Appearance)
12.4. Programs Used (Order of Appearance)

0. Abstract

In this thesis we have through a review of theoretical perspectives analyzed the possibilities

of a model for stakeholder engagement on social media. We took our outset in stakeholder

theory and brought into discussion two logics of engagement, the logic of influence and

community respectively. We found further inspiration for the proposed model in collective

intelligence, business intelligence and text mining theory, which we discussed in relation to

the two logics of engagement. Our analysis resulted in a model, where stakeholder

engagements on social media could be conceptualized as the establishment of a collective

intelligence-system. With this we found support for the argument that stakeholder

engagement, as a discipline which seeks to listen to and learn from stakeholders, can be taken

to social media. When taking this view, communications from stakeholders on social media

become information that may aid a company in daily decision-making processes. In order to

obtain this information we look to the text mining discipline and here we found, given the

nature of text mining and social media respectively, that it may be necessary to narrow down

the purpose before applying the model. We find that properties of Twitter as a social

technology may support such information-gathering activities especially well. Our case relates

to the position of a Communications Manager at Novo Nordisk and the dataset applies to his

position alone. Upon applying the model to the case of his work and the data, we found a

collective of stakeholders communicating largely about the same issues. However, we found

only indications of such activity and were not able to derive from our dataset information of a

quality with which we could qualify decisions. We suggest that this is attributed to the nature

of the dataset and the scope of the text mining capability in this thesis more than a failure of

the proposed model.

We end the thesis with a discussion of the inherent challenges related to social media and

therefore also the model. We find that when engaging in social media to find information

there are challenges relating to the nature of online identities, as well as the information

disseminated by those online identities. Furthermore, we discuss the consequences of

gathering information in such a way in relation to stakeholder engagement. We end by

concluding that there are challenges left to overcome but that the model may yet be

applicable.

1. Introduction

In 1979 Michael E. Porter presented a framework for analysis of factors affecting the

capabilities of a company’s strategic development. Porter’s Five Forces have become a

mainstay of business theory, and are thought by Porter to arise from the inherent

competitiveness in a company’s industrial environment. (Porter, 1979) The relevance of such

an analysis of forces impacting a company and its strategic development persists to this day,

and is perhaps more salient than ever as we have unequivocally entered into the millennium

of globalization. Whether Porter had envisioned the intensely competitive nature of the

twenty-first century is difficult to assess but while the concept of analyzing the forces of your

environment remains relevant today, the question is if industry competition can still suffice to

describe what affects companies.

The threat of external forces impacting on a company’s activities is now more than ever a

reality. Globalization has brought with it an intensified threat of new entrants and substitute

products (Porter, p. 141, 1979), but perhaps the most significant change to the external

influences has come about with the increased focus on the ethical and moral responsibilities

of companies. Corporate Social Responsibility, while not exactly a new concept, surely within

recent years has seen an increased focus in both the minds of political leaders and common

people.

R. Edward Freeman (1984), the Father of stakeholder theory, included the concept as a way to

describe that companies carry responsibilities beyond that of accountability to shareholders.

(Freeman, p. 38-40, 1984) As mentioned, today there is a much-increased focus on the

responsibilities of companies and as such especially Porter’s “Bargaining Power of Customers”

has increased dramatically. Perhaps one might rightly suggest that today a more fitting

description of this force would be the “bargaining power of stakeholders”, encompassing any

and all who are affected by or take interest in the activities of a given company.

“Corporate social responsibility is often looked at as an "add on" to "business as usual," and

the phrase often heard from executives is "corporate social responsibility is fine, if you can

afford it…Given the turbulence that business organizations are currently facing and the

very nature of the external environment, as consisting of economic and socio-political

forces, there is a need for conceptual schemata which analyze these forces in an integrative

fashion.” (Freeman, p. 40, 1984)

It should be no surprise that a company’s activity generally tends to lean towards making a

profit, and by that logic it seems somewhat rational to narrow our attention on e.g. opinions of

shareholders. This concept of broadening the spectrum of a company’s responsibilities goes

back decades but today the attention and perceived importance of such dealings have no

doubt increased dramatically. So what changed? The contention of many authors (Li &

Bernoff, 2008; Benett, 2003; Castelló et al., Forthcoming; Scherer & Palazzo, 2011) is that the

modern era of digitalization has brought about changes in the power relations between

stakeholders and companies. The proliferation of digital communication and information

enabled especially by the coming of Web 2.0 technologies has been seen as a threat to

companies across the globe. This is not the prediction of some fortune cookie; it is the reality

that surrounds us. Stakeholders today are able to gather information, analyze it, form an

opinion (sometimes well-founded, sometimes not) and disseminate it in a digital space, where

potentially millions of stakeholders sit in wait to consume it. Some present such opinions in

the form of an opinionated blog, some as a status update on either Twitter or Facebook and

some as an informative video. Facebook now has over 900 million users and Twitter over 500

million users, and this is exactly what has changed: the proliferation of social media use. In

2012 communication about anything and everything is running rampant, and where

companies in the past may have had a say in what the newspaper printed, management of

such content is today at best an illusory concept.

Perhaps this serves as an explanation of why companies slowly but surely have adopted the

use of social media. In November 2011 the McKinsey Global Institute carried out a survey

asking 4,261 global executives about their adoption of social technologies and the perceived

benefits gained from the adoption. They showed that of the corporations involved 72% have

adopted at least one social technology into their efforts. However, only 1,949 of the

respondents reported at least one measurable benefit, which may speak to the fact that

obtaining benefits from efforts on social media is a difficult task. (Bughin, Byers & Chui, 2011)
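
The proportions behind these survey figures can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes that both counts refer to the same respondent base and that the 72% adoption rate applies to all respondents, neither of which is stated explicitly above. The variable names are our own.

```python
# Figures as quoted above from Bughin, Byers & Chui (2011).
respondents = 4261        # executives surveyed
adoption_rate = 0.72      # share having adopted at least one social technology
reported_benefit = 1949   # respondents reporting at least one measurable benefit

# Assumption: both counts refer to the same respondent base.
benefit_share_all = reported_benefit / respondents

# Assumption: the adoption rate applies to all respondents.
estimated_adopters = adoption_rate * respondents
benefit_share_adopters = reported_benefit / estimated_adopters

print(f"Share of all respondents reporting a benefit: {benefit_share_all:.1%}")
print(f"Share of estimated adopters reporting a benefit: {benefit_share_adopters:.1%}")
```

Under these assumptions, fewer than half of all respondents, and somewhat under two thirds of the estimated adopters, reported a measurable benefit.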

The motives for companies engaging in social media are no doubt many. Some might be

engaging to manage the threat of having no presence, and thereby no chance of exerting any

control whatsoever, while others may be engaging to exploit the opportunities presented by

the technology. In this thesis we treat the developments within the last ten years partially as a

threat and partially as an opportunity. Part threat because the brand nature of companies is

sensitive to information that gives them a bad name and, as we have outlined, this is now

difficult to control. Part opportunity because we believe that the correct approach to engaging

in social media provides unprecedented potential for connecting with stakeholders in a way

that may strengthen relationships. This is the crux of the discussions and perspectives

presented in this thesis: engaging in social media with the express purpose of connecting with

stakeholders, listening to what they have to say, and from that deriving which areas a

company may focus on to increase value.

We take our starting point in contemporary discussions within the field of stakeholder

engagement, highlighting two competing logics, the logic of influence and the logic of

community. From this we derive the concepts we believe may fit when the prospect is to take

stakeholder engagement to social media. We analyze the practical applicability of these

perspectives and find that social media is of such nature that another supporting perspective

is needed. This takes us into the field of theory related to collective intelligence, which might

aid us in deriving value from social media by conceptualizing it as a place where ideas and

solutions are generated each and every day. We couple these perspectives with those of

stakeholder theory in order to find support for a model that in practice may leverage the use

of social media as a source for information. To harvest such information we include

methodology from the increasingly recognized area of text mining. In order to attempt to

apply the model we derive, we focus on a single social technology, namely

Twitter, and case material provided to us by a Communications Manager at the department

for Corporate Sustainability at Novo Nordisk. In the case we analyze the capability of our

model in relation to the position of this manager by applying text mining methods on 7763

tweets from 58 accounts on Twitter. We conclude the thesis by evaluation of our results and

our model in order to assess whether the unified perspectives from theory may be brought

into business practice. We summarize the project in our thesis in the following statement.

1.1. Problem Statement

This thesis should be seen as an attempt to unify theoretical aspirations and capabilities of

stakeholder theory and collective intelligence respectively in order to conceptualize a model

for bringing stakeholder engagement to social media.

2. Problem Area

While we touch on and allow inspiration to flow from many fields of study throughout the

thesis (e.g. collective intelligence, social media, business intelligence, text mining and

stakeholder theory), relations between people and corporations seem inseparable from the

field of stakeholder engagement. As such moving toward the betterment of corporate efforts

within this business discipline becomes our primary focus and the locus of our analysis.

The rising attendance on social media seems a self-perpetuating effect as people tell their

friends, who tell their friends, and so on. In a five-wave study published in

2011, the last wave, in which 37,600 people globally were polled, showed considerable

attendance on social media. 61% answered that within six months they managed a profile on

a social media site, while 64% read blogs and, perhaps just as exciting, 75% answered that they

visited company/brand websites. In sharp contrast to this, in the first wave of the study four

years earlier only 27% of respondents had created a profile on a social media site. (Hutton &

Fosdick, 2011) This is interesting because it provides some proof of the proliferation of social

media use among stakeholders, while at the same time establishing that companies are within

the realm of stakeholder interests online.

As such it might even seem a natural development that stakeholder engagement is moving

toward initiatives on social media. However, as with any initiative the road toward

implementation is paved with challenges. As presented by Castelló, Etter and Morsing

(Forthcoming) in a study of a company’s assessment of the possibilities of taking stakeholder

engagement to social media, two competing logics of engagement are highlighted. In this they

focus heavily on the managerial and institutional challenges of communication on social

media as a part of stakeholder engagement. (Castelló et al., p. 1, Forthcoming) In the context of

this thesis we derive from this article perspectives in these two logics and treat them as a

foundation for analysis and discussions, which may aid us in highlighting the difficulty as well

as the perceived value in moving from a traditional view of engagements to one where

engagements happen on social media. This will help us assess what the challenges are from a

corporate perspective, and the discussion has served as great inspiration for our perspective.

The following descriptions of the logics as they present them are a derivation of their study of

a single company and we will seek to further qualify that these fit the contemporary

perceptions within the field.

The logic of influence (Castelló et al., p. 15-17, Forthcoming)

- Influence: The company seeks to influence stakeholder opinion through their

engagements with the purpose of preventing conflict, reducing risks and gaining

knowledge from key stakeholders. E.g. a company wanting to erect a wind turbine to

decrease their energy consumption costs may meet resistance from locals in the area

where the turbine is to be. They may then attempt to engage in dialogue with the

locals with the purpose of reaching a compromise or solution agreeable to both parties.

- Firm centered: The company decides what is and what is not a good topic for

engagement, and the selection of who to include. Not that stakeholders have no say in

this but different topics are analyzed and prioritized in accordance with internal

perceptions of importance.

- Contract based: The engagements are organized around hierarchical processes and

rules. What this means is that the engagements are subject to internal regulation of

employees, and while this is somewhat of a broad description, it stands to reason that

some companies regulate at least part of what their employees can and cannot discuss

in public.

- Face-to-face: The ideal and largely preferred method for engagement is described as

face-to-face. The reasoning behind this is not explicitly defined but a reasonable

suggestion seems to be as Ikujiro Nonaka describes it, that tacit knowledge may only

be made explicit through a process of externalization.1

1 In his 1997 article “Organizational Knowledge Creation”, Nonaka describes externalization as the process by which one person transfers her tacit knowledge to another person. He stresses dialogue through face-to-face interaction as a means to this end.

We include into our considerations the best practice descriptions delivered by the

organization AccountAbility: “Since 1995, AccountAbility has been focused on “mainstreaming”

sustainability into business thinking and practice. Our widely-used AA1000 standards, leading-

edge research, and strategic advisory services help organisations become more accountable,

responsible, and sustainable.”2. What is interesting about these standards is that many of the

descriptions correlate directly to the concepts of the logic of influence, while at the same time

presenting descriptions that seem to support arguments for the logic of community.

(AccountAbility, 2011) We start by correlating AccountAbility standards to the logic of

influence and then do the same when we have presented the concepts of the logic of

community.

“Stakeholder engagement then is the process used by an organisation to engage relevant

stakeholders for a clear purpose to achieve accepted outcomes.” (AccountAbility, p. 6,

2011)

The above citation alone may lead one to think that they argue for the logic of influence. It is

about including relevant stakeholders with a clear purpose in mind to achieve only accepted

outcomes. It seems plausible to suggest that this reiterates the statement that engagements

are firm-centered as well as contract-based. While it is stressed that the owners of the

engagement must include stakeholders in the definition of the purpose they go on to describe

the importance of carefully considering who needs to be involved. (AccountAbility, p. 22-24,

2011) Describing this as a paradox may be over the top but it seems easily imaginable that if

the company decides on whom to include, they are at least in part also in control of the

purpose and the outcome.

The problem in relation to integrating social media into the engagements may, with these

descriptions in mind, be, as Dellarocas (2003) describes it, that the volatility and

unpredictability of the communication make it very difficult to assess the outcome of the

engagement. This seems a given due to the sheer volume of communication happening daily.

A company might then have a very clear purpose when engaging on social media but how

2 Taken from www.AccountAbility.org, the official website of the organisation.

would one predict the outcome when anyone can join the conversation? We return to this

discussion in section 2.1.

The logic of community

(Castelló et al., p. 17-18, Forthcoming)

- Collective interest: The company seeks to engage in dialogue on social media,

encouraging a broader spectrum of stakeholders to participate in conversations and

thereby enhancing inclusivity.

- Topic centered: As inclusivity and dialogue increase and more stakeholders join the

conversation, controlling the topic of each engagement becomes an arduous task. As

such the logic of community represents an engagement logic, which allows the topic

for discussion to be spawned by stakeholder interest and not company prioritization.

- Participation: Not only does it encourage increased participation among stakeholders

but also among employees. They argue that means should be established for each

member of a company to participate to increase visibility.

- Network: Including more and more stakeholders in engagement efforts means

recognizing that this enables multiple conversations across space and time boundaries.

Most interesting to us here will be that it seems the perception of an engagement now focuses

on including as many stakeholders as possible, and letting them decide what is an interesting

topic of discussion. Li and Bernoff (2008) speak to the same issues and although the angle is

different the message is seemingly the same and quite clear:

“…So work on both fronts in your company – muster up the humility to listen and tap into

the skill to take what you’ve heard and make improvements. That’s embracing the

groundswell, and it pays by shortening the distance between you and your next successful

innovation.” (Li & Bernoff, chap. 10, 2008, n.p.)

To clarify, the groundswell is a broad definition encompassing any and all members present

on the sum total of all social technologies on the web. (Li & Bernoff, chap. 1, 2008, n.p.) They

continuously stress the fact that it is the stakeholders in the groundswell, and not the

companies, who are in control and encourage companies to alleviate this threat through e.g.

acts of listening in on and talking to the groundswell. (Li & Bernoff, chap. 5-6, 2008, n.p.)

These concepts are fairly self-explanatory and we do not wish to dwell on these. But if the

purpose, as it seems to be, is to see stakeholders and communications on social media as

valuable resources that disseminate information usable in both innovation and relationship-

building (Li & Bernoff, chap. 4, 2008, n.p.), then the view seems to correlate strongly to the

logic of community. As mentioned, the angle is different: their focus lies in their contention

that if you have a brand that you wish to maintain or develop, you’re under threat from the

groundswell.

“If you have a brand, you’re under threat. Your customers have always had an idea about

what your brand signifies, an idea that may vary from the image you are projecting. Now

they’re talking to each other about that idea. They are redefining for themselves the brand

you spent millions of dollars, or even hundreds of millions of dollars, creating.” (Li &

Bernoff, chap. 1, 2008, n.p.)

If this is indeed the case, as a myriad of examples in their book demonstrate, and

stakeholder engagement is at least in part about building brand trust and value, then it may

suggest that engagements on social media in today’s world are an absolute necessity. In the

section to come we discuss the elements of the two perspectives with the purpose of

uncovering why, in the case of engagements on social media, the logic of influence is not

suitable and what challenges remain in regards to the application of the logic of community in

the same respect.

2.1. Discussion: Community or Influence?

Taking into consideration the descriptions presented in the previous section it should be clear

that there are contradictions between the two logics. First, it seems reasonable to suggest that

there is a considerable shift in perspective when going from one where the purpose of the

engagement is to influence stakeholder opinion, to one where the essential question is: “What

is the opinion of the stakeholder?”. Second, another considerable shift occurs when the issue

to be addressed by an engagement removes itself from company control and ends in

stakeholder control. If we process these shifts in an idealized way one might conclude that a

company uncritically must listen to opinions, and move to engage itself in the issues

expressed by those opinions with little regard for relevance. This is most probably an

exaggeration of the intentions behind this perspective. However, if we take engagements to

social media and encourage anyone to join and speak to issues important to them, all the

while knowing that the technology is molded in such a way that we cannot assess who they

are and what they stand for (Dellarocas, p. 1410, 2003), then dissecting value of opinion

relative to the company seems very difficult.

Today’s companies are without a doubt highly professionalized; competition demands it.

Information drives decisions, and as such good decisions derive from good sources of

information. This may demonstrate, as the logic of influence seems to argue, that carefully

assessing which stakeholders to include is a rational choice. E.g. a patient suffering from high

blood pressure may be of very little value in evaluating the relative efficiency of a medicine’s

chemical synthesis; conversely, she may be able to deliver valuable insight into the effects that

synthesis has on a human body. As such one might rightly suggest that she would be a

valuable resource in an engagement where the topic is the one, but not the other.

This leaves us in somewhat of a dilemma at least if the project of stakeholder engagement

remains as described by AccountAbility:

“They then discover that it (stakeholder engagement) can contribute just as much to

strategic as to operational improvement. Engagement can be a tremendous source of

innovation and new partnerships. Leading companies are discovering that a growing

percentage of innovation is coming from outside the organisation and not from within.

They realise that stakeholders are a resource and not simply an irritant to be ‘managed’.”

(AccountAbility, p. 8, 2011)

It might seem now that the influence logic prevails in its considerations and as such might be

the best for social media as well. However, social media does not facilitate face-to-face

interaction and it seems highly unlikely that engagements there could ever be contract-based if we cannot

predict who joins the discussion. We might have a stakeholder of malicious intent joining the

discussion and purposefully providing false or misleading information, which may lead to an

unintended outcome. It seems that from this discussion we might rightly ask the question:

“What is the purpose of taking stakeholder engagement to social media?”

If it is purely supposed to be about communicating with more stakeholders, and this is seen as

a good in itself, then it seems we may allow ourselves to be less critical of who joins the

conversation and who does not. But what sort of value does this bring into the company? How

do you measure the effects of a perceived positive interaction on social media? As a report by

Hypatia Research (2011) suggests, these questions remain under scrutiny by professionals

within the companies. They report that challenges to investing in social media include, among

others, a lack of standard ROI (return on investment) metrics, which is under heavy debate3, and

a lack of business case goals. (Hypatia Research LLC, p. 4-5, 2011) This lust for predictability is

certainly understandable but the question remains if it is attainable on social media. Lastly, if

stakeholder engagement is about solving issues in cooperation then simple judgments of

efficiency should lead one to the thinking that social media is not suited for stakeholder

engagement. However, in such a view it seems that stakeholder engagement is at least in part

about treating stakeholders as a source of knowledge and if this is so then we should treat it

as such. Doing this we believe may demand a different, although conceptually similar,

perspective.

In the coming section we lay out a theoretical framework which we believe provides strong

support for the argument that communications on social media, when approached correctly,

may deliver value and help guide decisions within the company. As our area of interest is the

stakeholder engagement discipline, collective intelligence theory might be just the perspective

we need to discover the information-potential of communications on social media.

3. More ideas lead to a better end-product: A Collective Intelligence perspective

One might find the prospect of basing decisions on information gathered from

communications on social media a strange one, not least as we have just stated that we believe

3 Debate over whether to use financial ROI metrics or non-monetary metrics. (Hypatia Research LLC, p. 4-5, 2011)

companies to be highly professionalized. How could we conceive that it might at all be

valuable to listen in on social media when the company most likely already has its internal

experts, or information-workers, tackling whatever information-demand might arise? Can it

be valuable at all?

Generally speaking, collective intelligence is about people in cooperation or perhaps even in

contest to solve a problem or come up with an idea, which lines up fairly well with general

presumptions of the benefits of stakeholder engagement. It promotes a perspective where if

we wish to solve a problem or come up with an idea, we may harness the power of a given

collective to find better solutions and better ideas. Such descriptions might provide us with

some insight into why stakeholder engagement is generally a good idea, but it also outlines an

incipient foundation for channeling engagements into social media. We will spend some time

assessing the theoretical notions and then through discussion address the perceived

compatibility with stakeholder engagement as a business discipline.

The concept of Collective Intelligence stems from years of research in different fields of study

e.g. biology and computer science (Leimeister, p. 245, 2010). It seems to describe a reality in

which inclusivity as a bearing principle promoted by the community logic might actually be a

reasonable one.

“A group of average people can – under certain conditions – achieve better results than any

individual of the group. This seems to hold even if one member of the group is more

intelligent than the rest of the group.” (Leimeister, p. 245, 2010)

First off, the term collective describes a group of individuals who may be but are not

necessarily of the same opinion on a given subject. Leimeister (2010) describes that this gives

rise to the possibility of revealing different opinions leading to the establishment of a more

nuanced perspective on a given task, which will then lead to better solutions. The term

intelligence refers to the ability to use one’s existing knowledge to learn, understand and adapt

to an environment so as to be able to handle a wide variety of situations. (Leimeister, p. 245,

2010) In other words an intelligent collective in this view may describe many different

compositions of people put together deliberately or by coincidence to solve a task which has

their interest. Such a composition would then be referred to as a collective intelligence system

(CI-system).

At the MIT Center for Collective Intelligence a study in 2009 sought to uncover the building

blocks of CI-systems from examples of what they called Web enabled collective intelligence.

(Malone, Laubacher & Dellarocas, p. 2, 2009) They asked the following questions in the

process (Malone et al., p. 3, 2009):

- Who is performing the task? Why are they doing it?

- What is being accomplished? How is it being done?

From these they found that they were able to describe such a system through a set of genes,

which enabled them to answer the above questions. The purpose, of course, was to be able to

recreate such a system in any setting suited to such activity and to outline what they

believe to be happening in such systems. (Malone et al., p. 3, 2009)

Our point of departure is social media and as such we look for descriptions that fit a platform,

where no direct hierarchical structures exist. That is to say if we imagine social media as a

place where ideas are generated and problems are solved daily, then it is an important

distinction that this is an autonomous effect and not one which arises because someone is in

control saying, “Solve this problem” or “Come up with an idea for this”. (Li & Bernoff, chap. 1,

2008, n.p.) This of course is not a claim that no such structures exist on social media at all;

they might, but in its essence it is an open environment where users are free to focus on what

they want. With this in mind, if we are to see social media as a CI-system, then the MIT study

describes this as the crowd gene, which serves as an answer to who is performing the task. As

they say, in the crowd gene, activities can be undertaken by anyone in a large group who

chooses to do so, without being assigned by someone in a position of authority. (Malone et al., p.

4, 2009)

Now one might wonder what would motivate people to join in on such activities. People

would probably be quick to say that if asked to spend time solving a problem, some sort of

compensation would be required. However, the study found that the motivation for engaging

in such a system might be more than just financial gain. They found three genes, which

describe motivations for engagement: money, love and glory. (Malone et al., p. 5, 2009) Money

is as is and the fact that this motivates people should not be a surprise to anyone. It may very

well be less obvious how the two remaining genes can describe motivation. Love as they

define it is the intrinsic enjoyment of an activity, the joy of socializing with others or

the feeling of contributing to a cause, and as such people who engage with such motivations

may not necessarily require financial compensation. (Malone et al., p. 5, 2009) E.g. a person

might work as a teacher but in her free time be totally absorbed by gourmet cooking. It seems

very reasonable to suggest that such a person would engage in activities, be it online

discussions or cooking for friends, without a prospect of financial gain. Glory as a motivating

gene is also an important one. It describes that people will engage in activities if they believe

they may be recognized for doing so. (Malone et al., p. 5, 2009) This also relates to the

previous example. She might cook for her friends for the love of cooking but might also do so

because she knows they will think highly of her cooking skills if she does it well.

So we seem to have established that CI-systems might arise anywhere and with the bearing

foundation being a range of different motivations. We then move on to the question of what is

being done in such systems. Researchers at MIT found two genes pertaining to this, what they

call the create and the decide gene. The create gene describes an actor in the system who

brings something new to the table, and the decide gene describes an actor who evaluates and

selects alternatives. (Malone et al., p. 5-6, 2009) Many examples of such activities already exist,

e.g. when Google went looking for a new logo for their Google Chrome page they asked users

of the community to create their suggestion of what this logo should look like with the

message “Let your creative juices flow, #chromies!”.4 One could conceive of such a request as

the formation of a CI-system, where perhaps a graphical designer who finds great enjoyment

in such an activity might submit her proposal not for financial gain but for the recognition of

having her logo on a page owned by Google. In this the create gene would be activated by the

designer and the decide gene by those evaluating which design is the best fit for the page.

The answer to the last question of how it is being done relates to the create and decide genes.

They describe two ways in which the create gene is activated, either through collection or

collaboration. The way they separate themselves from each other is the way in which

contributions are generated. Collection describes a process where an actor creates an item for

4 https://plus.google.com/u/0/100585555255542998765/posts/h7LNZ8zUAdF

contribution independently from other items. They found that a property of this process is

that contributions may be viewed as being in contest with each other. Collaboration then

describes a process where actors work together to create an item for contribution. (Malone et

al., p. 6-7, 2009) Our previous example from Google could describe creation through the

collection gene, where users submit their proposals in a contest for recognition. And the

activities leading to the conception of different articles on Wikipedia, where users work on the

same articles to ensure their quality would be an example of the collaboration gene.

“But also for companies there are various new potentials for improving their creativity and

innovation capabilities. The challenge is to understand how to unleash the vastly unused

knowledge or experience of their employees, customers, or partners, and thus leveraging

their inherent collective intelligence.” (Leimeister, p. 246, 2010)

This all sounds quite wonderful but before launching ourselves into the formation of such a

system we recall that Leimeister (2010) points out that such a system will be better only under

certain conditions. Earlier we spoke briefly about the fact that good decisions demand good

information and as such if we were to tap into a CI-system with the purpose of generating

support for decisions within the company, it would seem wise to attempt to assess the quality

of the information generated by the collective. Bonabeau (2009) speaks to this issue and

states that a detriment is that our basic human nature can lead us astray when we’re making

important decisions. (Bonabeau, p. 47, 2009) We tend to favor information that fits our

current beliefs, be untrusting of information which speaks to the contrary and let ourselves be

influenced by how the information is presented. (Bonabeau, p. 46, 2009) While this may serve

as a good argument for why companies should seek information in their external environment

to strengthen their decisions, it also sets the requirement for ways to escape these bias-

distortions.

Bonabeau (2009) suggests a framework that may help diminish these tendencies and thereby

strengthen decisions. Outreach is one aspect of this framework and simply states that value

can be obtained through increasing the number of contributing individuals in the system.

Another aspect is that of additive aggregation, the concept of which is to collect information

from a large number of sources, and perform some kind of averaging to make the information

collected from the system more reliable. (Bonabeau, p. 47, 2009) Lastly, he states that self-

organization is an important aspect. Mechanisms must be in place that allow for actors in the

system to interact, which he states is what allows the whole to be more than the sum of its

parts. (Bonabeau, p. 48, 2009)
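
The additive aggregation aspect described above can be illustrated with a minimal sketch. It is not taken from Bonabeau (2009); the five estimates and the true value are hypothetical numbers chosen by us, and the plain arithmetic mean is only one possible form of averaging. What the sketch shows is a general property of the mean: the error of the averaged estimate can never exceed the average error of the individual contributors.

```python
from statistics import mean


def aggregate_error(estimates, truth):
    """Absolute error of the averaged (aggregated) estimate."""
    return abs(mean(estimates) - truth)


def average_individual_error(estimates, truth):
    """Mean of the absolute errors made by each contributor on their own."""
    return mean([abs(e - truth) for e in estimates])


# Hypothetical guesses from five contributors about a quantity whose
# true value we assume to be 100 (illustrative numbers only).
estimates = [90, 110, 95, 120, 85]
truth = 100

print("error of the averaged estimate:", aggregate_error(estimates, truth))
print("average individual error:", average_individual_error(estimates, truth))
```

Note that this does not guarantee that the collective beats its best individual member; it only shows why averaging over many sources tends to make the collected information more reliable than a typical single source.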

Bonabeau (2009) then goes on to outline important considerations when attempting to

establish a CI-system. We will outline a few of these considerations most important to our

continuing project. First off, there must be a balance between diversity and in-depth expertise

when considering who to include in such a system. Diversity may lead to different

perspectives and a lot of different solutions, which is indeed seen as a good thing, but if no one

actually has any knowledge of the area the system is supposed to operate within, then it

seems unlikely that it will deliver additional value. Furthermore, the company must prepare

itself to lose control which poses predictability and liability issues. (Bonabeau, p. 48-49, 2009)

We comment on this later in section 3.1.

To sum up, CI-systems when founded correctly, with the proper precautions and

considerations taken into account, are able to deliver great value to a company seeking

decision-support in its external environment. In fact, as we have seen this may help a

company make better decisions by diminishing bias-distortion. As such, this seems to

correlate quite well with the assumption that tapping into stakeholder opinion with the

purpose of generating better solutions may be very valuable to a company. In the coming

section we discuss how this perspective correlates with the theoretical standpoints of

stakeholder theory and begin to outline our proposal that stakeholder engagement, when

conceived of as a CI-system, may fit social media very well.

3.1. Discussion: Recapitulation

Outreach promotes inclusivity

Both perspectives seem to promote inclusivity as a resource in itself. It seems an overarching

principle is that the more eyes we have on a task, the better the solution generated in

response to this will be. This seems to lend credence to the argument that we should include

more stakeholders in our pursuit of better decision-making. In the perspective of the logic of

community, including more people is a benefit because it will allow us to establish stronger

relationships with a broader spectrum of our stakeholders. So should we move to include

everyone who wants to join whenever we have a topic for discussion? This brings us to our next

discussion.

Diversity is great but expertise is a must

The influence logic states that only key stakeholders should be included in the engagement.

Whether this is due to the fact that proponents of this logic find the inclusion of more diverse

groups of stakeholders irrelevant or too costly, or it is the contention that topics of

engagement are complex topics and demand professional insight, is hard to say. We can

however, with collective intelligence in mind, say that there needs to be some consideration of whom we

include in our engagement efforts. Let us assume for a moment that social media as a whole

is a CI-system. Undoubtedly, with what we know of social media this would most probably

end in a case of too much diversity but does this fact disqualify it or does it perhaps, as

Hypatia Research suggests, stress the fact that there needs to be a business case? (Hypatia

Research LLC, p. 4-5, 2011) Here, having a business case means zeroing in on a specific

process of decision-making. E.g. if we are looking to establish a CI-system for enhanced

innovation capability in relation to a specific product-line, then the system should hold

stakeholders who discuss this topic. Exactly what constitutes the correct balance is not

explicitly defined and we return to this discussion continuously throughout the thesis.

It seems that the well-known concept of segmentation is prevalent in this way of thinking. Li

and Bernoff (2008) also argue for this when they suggest that a company’s successful

entrance into social media hinges on their ability to develop what they call a social

technographics profile (Li & Bernoff, chap. 3, 2008, n.p.):

“To truly understand the groundswell, you need to dissect and quantify the dynamics that

separate different participants. Why? Because a strategy that treats everyone alike will

spell failure – people aren’t alike and won’t respond in the same way.” (Li & Bernoff, chap.

3, 2008, n.p.)

Another argument for segmentation can be found within a philosophical debate which has

been going on in recent years. We include here two considerations expressed in this debate.

The first is the concept of ubiquitous expertise, which speaks to the fact that any person may

be considered an expert by merit of having lived a life in a given context. E.g. almost all of us

can be seen as experts by being able to speak our native language and this applies in other

aspects of life as well. As such, people who we would not normally consider to possess

valuable insight into a given field of knowledge might yet carry a perspective we ourselves

have not considered. (Collins & Evans, p. 16, 2007) However, Collins and Evans also point out that this does

not merit the idea that we should consider everyone’s judgments to be equally valuable; when

looking for insights we should be mindful of the fact that there are different levels of

expertise.5 (Collins & Evans, p. 14, 2007) We include these considerations because they seem to

stress the fact that in any case where we wish to include stakeholder insight into our decision-

making processes, we must make an attempt at assessing what we actually can learn from

those involved.

Social media is self-organization

Another aspect of securing a successful CI-system is to allow for it to be self-organized. The

argument here seems to be that when interaction among stakeholders in the system is

enabled it will create additional value. This concept may seem somewhat vague and as such

we allow ourselves an attempt at an interpretation. In the context of social media interaction

may speak to mechanisms that allow stakeholders to evaluate, learn from and comment on

each other’s contributions. Through this interaction actors may receive comments on their

contributions, which may help them visualize the strengths and weaknesses of their own

standpoints and aid them in the further development of these.

This also seems to correlate with the company’s release of control, which as shown by

Bonabeau (2009) they must prepare for. Social media may very well be perceived as a self-

organizing whole since there are no explicit hierarchical structures. When you create an

account on e.g. Twitter you will have the same options as when a company creates an account.

Even though a company is most probably more acknowledged and easily recognizable,

5 Please refer to Appendix-1 for The Periodic Table of Expertises as presented by Collins and Evans.

principally you will be able to participate in the same discussions as they do. With this in mind

it would seem that social media is able to live up to this criterion as well.

Additive aggregation may find topics for engagement

One of the ideals of the logic of community is for the topic of the engagement to be spawned

by stakeholder interest. We note that this is not a claim that the concept of additive

aggregation is the same but it seems to provide a perspective on how a company might find

those topics. It seems unlikely that we could find it rational to base topics for engagement on

what a few stakeholders feel is important. However, if we instead employ some form of

averaging by analyzing what a larger number of stakeholders have to say about a given topic,

then this might provide strength to the decision that it would be a good topic for engagement.
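
As a minimal sketch of what such averaging could look like, we can compute the share of distinct stakeholders who raise each topic. The sketch assumes that posts have already been labelled with a topic, a step it does not perform itself. The account names, the topics and the function name topic_support are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def topic_support(posts):
    """Rank topics by the share of distinct stakeholders who mentioned them.

    `posts` is an iterable of (author, topic) pairs. Counting distinct
    authors rather than posts keeps one very active account from
    dominating the result.
    """
    authors_by_topic = defaultdict(set)
    all_authors = set()
    for author, topic in posts:
        authors_by_topic[topic].add(author)
        all_authors.add(author)

    total = len(all_authors)
    shares = {
        topic: len(authors) / total
        for topic, authors in authors_by_topic.items()
    }
    # Highest share first; topics with equal shares keep their input order.
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, hand-labelled posts (illustrative data only).
posts = [
    ("patient_a", "pricing"),
    ("patient_a", "pricing"),
    ("ngo_b", "pricing"),
    ("ngo_b", "animal testing"),
    ("news_c", "pricing"),
    ("employee_d", "working conditions"),
]

ranking = topic_support(posts)
for topic, share in ranking:
    print(f"{topic}: raised by {share:.0%} of stakeholders")
```

In this illustration a topic raised by three of four stakeholders would rank above topics raised by only one, which is the kind of support we have in mind when we speak of strengthening the choice of a topic for engagement.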

To conclude this recapitulation we find that there are arguments speaking to the benefit of

taking stakeholder engagement to social media. Social media will by its nature provide us

with the possibility of expanding our concept of inclusion but, as we have hopefully made

clear, this should be done with regard for relevance. It may be an ideal that we would be able to

respond to and assess any issue brought forward by any stakeholder, but the sheer manpower

required to do so coupled with increasing odds of ending up tackling irrelevancies seems to

speak against this. As the theory of collective intelligence suggests, we must aim to find the

right balance between diversity and expertise. However, this should not excuse the choice to

continue keeping everything within company walls. With what we have shown it seems that

the logic of influence in its ideal may suffer from bias-distortions due to the fact that nearly

everything is decided internally. Taking a level-headed look at this, with the many diverse

groups of stakeholders today interested in company activities how could we possibly justify

only considering a few? Whether the motivation is to tackle the threat of not having a

presence online or it springs from genuine interest in cooperation, it seems the arguments

predominantly speak in favor of tapping into the groundswell. In the coming section we take a

step back and clarify how we got here, what merits our contentions so far and throughout the

rest of the thesis.

4. Method and Discussion

The considerations and proposals presented in this thesis are largely based on a theoretical

review of contemporary discussions regarding social media, stakeholder engagement and

collective intelligence. The mass proliferation of social media use has fundamentally changed

the way in which we communicate, who we can communicate with and also what information

is available to us on a daily basis. Scrolling down your Facebook stream, how much

information will you find that you would not otherwise have found? We are now more

connected with the world than ever before. There may be numerous reasons why

companies have chosen to suit up and participate. It may be that it provides a generous

platform for low-cost marketing initiatives, or as Li and Bernoff (2008) claim because brands

are under serious threat from stakeholders interacting online.

Our interest in stakeholder engagement sprang mainly from a study by Castello, Morsing and

Etter (Forthcoming), which showed that the discipline is moving toward social media,

although they also showed that the goal was mainly communicating with more stakeholders,

the argument for which seems to be increased visibility. (Castello et al., p. 22, Forthcoming)

Visibility is of course beneficial in so far as it enables more people to get familiar with a brand,

and the fact that a stakeholder is able to have a conversation with a representative of that

brand may help in building trust. However, this “communication-alone” approach has led

some professionals to scrutinize the perceived ROI in social media. Some have proposed

annual customer satisfaction and retention as measures of ROI (Hypatia Research LLC, p. 4-5,

2011), and while we do not question that these measures may be relevant, it does seem hard

to objectively assess the causal relationship between these and interactions on social media. If

we have more satisfied customers how would we know whether this is due to a great product

or a positive interaction on social media? The same goes for sales. An employee’s great

handling of a stakeholder on social media may very well lead to a sale but to be sure of this,

we would need to assess if that stakeholder goes on to buy something online. Not to mention

that the stakeholder might just as well go to a physical store and buy a product, in which case

tracking would be even harder. There may very well be companies who have succeeded in

figuring this out but it does speak to the challenges of investing in social media.

When we in the coming section move toward our own proposal of a way to engage in social

media it becomes important to clarify that the proposals are based primarily on our review of

literature. Not much, at least to our knowledge, has been written on the relationship between

social media and stakeholder engagement, or the practicality of engagements on social media

in general.

“However, the foundational literature in stakeholder engagement ill prepares us for a

world of networked societies with “geographically distributed cognition” and globalized

relations.” (Castello et al., p. 2, Forthcoming)

The consequence of this in the context of this thesis is that we have attempted through our

theoretical review to find support for the argument that stakeholder engagement as a

discipline which seeks to learn from stakeholders to find solutions to prevalent issues, can be

taken to social media. There may be a myriad of challenges pertaining to this and we do not

claim to possess the insight to cover all of them (to name a few: legal, organizational and

financial constraints may be barriers to entrance). As such, we will not propose that our model

will fit everyone. It has instead become our project to derive the best possible model which

complies with the collection of theory presented in the thesis. In other words, this thesis is in

its essence an attempt to convert theoretical aspirations to practical applications. This also

serves to note that we remove ourselves from any scientific claims and focus solely on a

model, which may allow companies to engage now. However, when we are proposing the

application of a CI-system, where in essence social interaction is producing an outcome, it may

be relevant to portray our view of how such outcomes might be founded. Here we apply a

social constructionist view that anything constructed in such a system will be contingent on

the social reality in which it was created. (Hacking, p. 11, 1999) This is not to say that the

ideas a CI-system produces cannot be relevant or usable elsewhere but it underlines a view

that when asking humans to produce knowledge, we need to consider the context. E.g. if we have

two groups of 20 people discussing the construction of the perfect car, in the one group they

may agree that the color red is the perfect choice, while in the other the color chosen is blue.

As such claims of generalizability and objectivity of the knowledge produced in such a system

would most likely demand supporting information. The need for generalizability and

objectivity would most likely hinge on the company in question and which decisions the

knowledge derived is meant to support.

As we move forward we start by presenting core principles which we believe may assist

companies in conceptualizing what a model for stakeholder engagement on social media

might look like. We then present what we believe to be a method fit for gathering information

from stakeholders on social media. We then go on to discuss how these principles might be

applied in practice in so far as we make a specific choice of a social technology, namely

Twitter, which will allow us to discuss in detail whether such principles can be supported by

functionality. Furthermore, we analyze claims from the field of Business Intelligence in order

to bring the model into implementation before applying it on a specific business case: The

case of how the model might aid a Communications Manager at Novo Nordisk. In this we will

relate our model specifically to his position in the company and the company in general,

which we find poses interesting challenges in relation to regulatory constraints. We end the

thesis by discussing the constraints of our model in general and in relation to the case, where

we bring up relevant debates in relation to social media in general.

5. Stakeholder Engagement as a Collective Intelligence system

When we cross-reference the guiding principles presented in stakeholder engagement and

collective intelligence theory respectively, we seem to have found some stable common ground. First off, we

seem to have established that there are benefits to including more stakeholders in our

engagements, while at the same time noting that this does not mean that anyone and

everyone will be relevant candidates. We believe that viewing the stakeholder engagement

discipline as one that is constantly trying to tap into the opinions and proposed solutions

produced in a CI-system may be highly beneficial. However, as we have also noted there is a

demand for some skepticism when doing so. With this in mind we derive the first carrying

principle of our model.

Include many but do so with care

Li and Bernoff (2008), AccountAbility (2011) and Bonabeau (2009) all propose a sort of

segmentation as a necessary means to ensure effectiveness. We therefore find it reasonable to

adhere to this principle and propose that when engaging in social media to “listen in” we

should include as many stakeholders as we can find whom we perceive to be

relevant candidates. We remember here that a balance is needed between the level of

diversity and expertise in the system. How we might secure such a balance is not explicitly

defined, and as such we illustrate it through the following example. If we

imagine that we are a pharmaceutical company this might be done by making sure that in the

system we include common people, patients, NGOs, advocacy groups, competitors, news

sources and perhaps even employees alike. Even more categories might exist relative to the

company in question but the overarching principle is to allow for different perspectives on the

same topics, which then may provide a more complete picture of what the prevalent issues

are that we need to respond to.

One might be tempted to question whether these stakeholders are at all present on social

media. It is a difficult question to answer but if we attempt an assessment according to the

above example, common people might be a description encapsulating a category of users too

broad to say something general about. There may be a myriad of motives for people to create

an account on a social media site and as such the perceived usefulness of such a stakeholder in

the system would most likely demand knowledge of the person behind the screen. In turn, it

seems reasonable to suggest that we may benefit from listening to what non-governmental

organizations are saying as they work professionally with areas like securing the

environment, human rights and the like. We note here that it is of course hard to say with any

certainty that the NGOs relevant to your company will be on social media. However, there is

some indication that NGOs have a presence if we look to www.wefollow.com, which is a site

listing top Twitter accounts by number of followers,6 and through this site the same

indication can be found for advocacy groups.7 The belief that competitors are on social media

might be qualified partly through the McKinsey study which we presented in the introduction,

where they showed that 72% of respondents had employed at least one technology.

Furthermore, in a study of the Fortune 500 Barnes and Andonian (2011) concluded the

following:

6 http://wefollow.com/twitter/ngo will show that there are many pages of accounts affiliated with various forms of NGOs. This of course is no objective claim but merely an indication that may lead us to believe that they are present on social media.

7 http://wefollow.com/twitter/advocacy

“Three hundred eight (62%) of the 2011 F500 have corporate Twitter accounts with a

tweet in the past thirty days.” (Barnes & Andonian, p. 6, 2011)

All of this of course does not cement the fact that those relevant to establishing your specific

CI-system are out there; however, it does at least confirm that for some, they will be. Finding

the right candidates and the right balance between diversity and expertise will no doubt

demand some research to be done before a choice is made.

Now let them decide

Additive aggregation is as mentioned in section 3.1 one of the properties of the framework,

which may enable a CI-system to deliver value. In relation to stakeholder engagement it might

speak to a perspective on how we may rightly decide what is important. We take in the sum

total of opinions and solutions proposed by stakeholders in the system and perform some

kind of averaging on this to qualify what is important. If we put this in the context of the

inclusivity principle as proposed by the logic of community it seems that even though we do

not include everyone we may be moving toward a more inclusive engagement strategy. At

least if this is as it seems about connecting to more stakeholders with the purpose of getting a

broader picture of what is important for us as a company to focus on. In other words we allow

for topic-centered instead of firm-centered engagements. It is the contention of the collective

intelligence theory that more proposals will lead to what one might call a better end-product.

As such basing what issues we choose to take seriously on the opinions of diverse groups of

stakeholders might allow our engagements to yield better results overall.

This means that stakeholder opinion now must decide relevant topics and as such relates to

what Bonabeau (2009) terms the loss of control. However, as we discussed in section 3.1, if

the system is to be self-organized to create additional value it seems a necessary evil. But if we

imagine having picked out who we find relevant to listen to and allow these stakeholders to

carry out business as usual by not interfering, then we have at least in part made sure that the

issues we find through additive aggregation are genuine, in so far as they are untouched by the

company’s bias. This is not to say that we claim to completely escape bias-distortions. As we

noted when presenting the logic of influence, if we decide who joins we also in part decide

what comes out of it. However, this may be as far as we can go to resolve bias-distortions

while still keeping relevance in sight.

Decide on a business case before engaging

We have talked a lot about how securing a balance between diversity and expertise is a

bearing principle of CI-systems. It seems reasonable to suggest that we then will have to

decide beforehand what it is we want to gain more insight into. E.g. how would we define an

expert if we did not have a domain within which this is expressed? As stakeholder

engagement is related to the concept of corporate social responsibility, and this is as Freeman

(1984) noted related to economic and socio-political forces it seems that it must tackle many

different areas. These are broad descriptions of course and as such one might imagine that

establishing a CI-system that cares for all the intricacies included in these forces would be a

difficult task, especially if we are to make sure that the actors we include in the system have

knowledge of the area we want to look into.

Establishing a business case also relates to the general difficulties revolving around a

company’s investment in social media. We mentioned in section 1 that only about half of the

respondents in the McKinsey study reported at least one benefit from engaging in social

media, while Hypatia Research proposed the lack thereof as one of the main challenges

pertaining to company difficulties related to evaluating return on investment. (Hypatia

Research LLC, p. 4-5, 2011) Naturally, we do not assume with any certainty that the

establishment of a business case will ease this burden but perhaps it may be easier to evaluate

the return when we draw out information instead of assuming that dialogue alone is the goal.

E.g. the business professionals using the information might be able to tell us whether or not

the information is helpful. We provide further argumentation for the establishment of a

business case in section 6 and section 8.

As we said in section 4 the purpose of this was to outline the core principles of our

perspective on what may be a helpful model for taking engagements to social media. There

are some intricacies left to cover before we make an attempt at applying the model on our

case. One of which we will look to in the coming section, namely, the method by which we

would be able to perform some form of additive aggregation on the opinions of the

stakeholders in our system. We look here to a discipline which has received a lot of well-

earned attention within recent years. Text mining cannot be said to be new in general, but it is

most probably a new concept to many companies. We start by providing an introduction to

the discipline relating it to the field Business Intelligence and the tool data mining. We present

the basic terminology of text mining and two methods, categorization and clustering

respectively before we, through discussion, analyze and qualify why this fits our model.

6. Text Mining to extract information from social media

H. P. Luhn in 1958 was one of the first to use the term Business Intelligence in his article A

Business Intelligence System. (Luhn, p. 314, 1958) However, it was not before the rise of

information-technology that the discipline truly took off. Since then the discipline has

received ever-increasing recognition for its capabilities of delivering value to a company. The

ability of business intelligence and data mining to analyze raw data, and from that derive

information to be used within the company has been praised time and time again. Data mining

concerns itself with what we would call structured data, or, data which lends itself to

tabularization. This is because when we think of data in this context we are typically referring

to data which exists in a database where single data elements are stored in tuples, which

represent a specific fact. A sale of a shirt would when stored in a database be set in a pre-

existing structure designed for a sale i.e. the structure could contain elements which refer to

the quantity, ID, place and time of a sale. These types of data elements are of course in

themselves valuable as they e.g. may help us keep track of how many shirts we have sold.

However, when storing data in structures like these has become popular it is not so much

because it is a convenient way of storing it. It is because when set in such structures data

mining, through its methods, is able to inform us about unknown patterns in our data, thereby

allowing us to get more information out of it than would previously have been possible.

(Berry & Linoff, 2004)
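The notion of structured data can be illustrated with a minimal sketch. The record layout, field names and values below are our own hypothetical choices, made only to mirror the shirt example; they are not taken from any actual system.

```python
from typing import NamedTuple


class Sale(NamedTuple):
    """A pre-defined structure for one sale (hypothetical field names)."""
    sale_id: int
    product: str
    quantity: int
    place: str
    time: str


# Each tuple represents one specific fact: a single sale.
sales = [
    Sale(1, "shirt", 2, "Aarhus", "2012-05-01T10:15"),
    Sale(2, "shirt", 3, "Odense", "2012-05-01T11:40"),
    Sale(3, "trousers", 1, "Aarhus", "2012-05-02T09:05"),
]

# Because every record follows the same structure, simple questions
# such as "how many shirts have we sold?" are easy to answer.
shirts_sold = sum(sale.quantity for sale in sales if sale.product == "shirt")
print(shirts_sold)
```

A free-text tweet about the same purchase would carry no such fixed fields, which is the sense in which textual data is called unstructured.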

Text mining, text data mining and text analytics refer to roughly the same process. The main

contrast between data mining and text mining, which actually employ many of the same methodologies,

lies within the fact that text mining concerns itself with unstructured data. This is termed as

such because in text mining we look to analyze textual data from a given language. You might

already be able to imagine how intensely difficult it would be to set up a pre-defined structure

to encapsulate the elements of a natural language, which is most likely the main reasoning

behind the term unstructured. The term is of course not a claim that a language has no

structure at all. (Feldman & Sanger, p. 1, 2007)

Text mining as a discipline stems from a variety of fields which have been in the grasp of

scientists (it has especially been employed to analyze massive amounts of biomedical

literature) for quite some time. Natural language processing (NLP) is an important part of text

mining and is a discipline which in very broad terms can be said to revolve around handling

language through a computer (or allowing computers to understand and process natural

language). (Feldman & Sanger, 2007, n.p.) As should be commonly known anything a

computer is able to handle, from the very complex low-level programs that allow it to start up

to high-level applications such as Microsoft Word, needs some kind of structure. This is a

given because it is our instructions which allow it to function and as such if we do not hold the

knowledge, neither does the computer. We mentioned a contrast between data mining and

text mining in so far as the data, when we collect it, is quite different. However, just as it is the

purpose of data mining to uncover previously unknown patterns in sets of data it is the

purpose of text mining as well. In doing so it derives many of its methods from the data

mining discipline and as such this field could also be said to be part of text mining. (Feldman &

Sanger, p. 1, 2007) In the context of this thesis, text mining is especially interesting, which we

attribute to the fact that almost every form of communication on social media presents itself

in the form of text.

”Even so, the very volume of comments out there is a vast source of information. And that’s

the second problem. Volume. There’s so much information flowing out of the groundswell,

it’s like watching a thousand television channels at once. To make sense of it, you need to

apply some technology, boiling down the chatter to a manageable stream of insights.” (Li &

Bernoff, chap. 10, 2008, n.p.)

Luckily for us, when applied with care and thought, this is exactly what text mining is capable

of doing. In an effort to qualify this statement we will spend some time taking you through a

basic terminology for text mining. We do this in an attempt to introduce the reader to the

domain and hopefully in the process reveal how information might be derived from analysis

of textual data. This we believe may clarify how text mining may serve as our vehicle for

performing additive aggregation on stakeholder communications on social media, and as such

stays congruent with our project.

6.1. Text Mining Basics

There are a lot of different presentations of what text mining is and what it is capable of. As

such attempting to derive an explanation of text mining from different sources may lead to

some confusion. Therefore, in an effort to stay consistent and avoid misinterpretation we

derive our taxonomy from Feldman and Sanger8 (2007) alone. What we present throughout

this section is but a small part of the total number of the methods text mining has to offer. We

have chosen to include the most important distinctions and will later, in sections 6.3 and 6.4,

present the methods by which we will attempt to analyze communications in our case.

Common text mining terms and their definitions

We have a bit of ground to cover so in an effort to provide the reader with a manageable

overview we list each term and its definition consecutively along with examples, which serve

to conceptualize text mining as a crucial part of our project. As we mentioned, when we work

with data in text mining we typically refer to the data as being unstructured. Much of what we

will present in the following serves to alleviate this by treating textual data conceptually not

as language but as numbers that we then may deliver to statistical methods to uncover

information.

A document (Feldman & Sanger, p. 3, 2007): When we refer to a document in the context of

text mining it can be defined simply as a sequence of text. In this interpretation it may

obviously represent a lot of different items e.g. an e-mail, a book, a blog-entry, a status-update

on Facebook and a tweet on Twitter. It does not necessarily have to be a meaningful text and

as such it rings true that…

8 ”The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data”, Ronen Feldman & James Sanger, 2007.

“Just tried out my new insulin-pen. Much better than its previous models. No more needlestick injuries for me!”

…would qualify as a document and

“Yellow red drove big little home bus station in train.”

…would also qualify as a document.

A document collection (Feldman & Sanger, p. 2, 2007): Our document collection then can be

defined as the sum total of all individual documents in our possession. However, an important

distinction is that we may easily have more than one collection of documents in our

overarching document collection.

Document features (Feldman & Sanger, p. 4-8, 2007): The concept of a document having

features demands a bit more of an in-depth explanation. It relates to the fact that any natural

language contains a vast amount of different words, everyday expressions, commas, dots and

even in many cases two words are interpreted and perceived as being one word. E.g. does the

adjective “everyday” consist of one or two words? If questions like these may confuse us

humans then you might imagine how much trouble they would cause a computer. Because of

this we make a distinction between the following document features:

- Characters: Any letter, numeral, special character and perhaps surprising to some, a

white space is also a character in a document.

- Words: Any single token that makes up a word in the context of some natural language.

Interestingly, in this interpretation “everyday” would have two tokens “every” and

“day”.

- Terms: Terms are referred to as words and/or expressions taken from a document and

which we (or a computer) could take to represent the individual document as a whole.

We wish here to refer to our previous example found in our description “A document”.

Here the terms representing the document might be “insulin-pen”, “better”, “models”,

“No” and “injuries”. Without complete knowledge of words and sentence structure in

the document we might still be able to infer that it is about some improvement of a

model of an insulin-pen, which causes no injuries.

- Concepts: The definition of concepts may be a bit more abstract since it does not

necessarily relate to words or terms found in the document. Concepts can be said to

characterize a document with a word without the need for reference to a word in the

document. I.e. imagine a short story (very short) about a girl’s tragic loss of her mother

in a car accident. It is entirely possible that the word “love” is not mentioned once in

the document but a concept of the story may still be found to be exactly that. Concepts

are typically found through methods of categorization within the domain of text

mining. We return to and expound upon categorization later in section 6.3.
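The distinctions between characters, words and terms can be made concrete with a small sketch applied to the insulin-pen document. The tokenization rule (a regular expression that keeps hyphenated words together) is our own simplifying assumption, and the terms are picked by hand exactly as in the description above; a real system would have to select them automatically.

```python
import re

document = ("Just tried out my new insulin-pen. Much better than its previous "
            "models. No more needlestick injuries for me!")

# Characters: every letter, numeral, special character and white space counts.
characters = list(document)

# Words: a simple tokenizer (an assumption) that keeps hyphenated
# words such as "insulin-pen" together as one token.
words = re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", document)

# Terms: the hand-picked words taken to represent the document as a whole.
chosen_terms = {"insulin-pen", "better", "models", "No", "injuries"}
terms = [word for word in words if word in chosen_terms]

print(len(characters), "characters")
print(len(words), "words")
print(terms)
```

Concepts are not sketched here, since they need not appear in the document at all and would require a categorization method to be found.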

This process of describing an individual document by its features relates to the

previously described notion that natural languages contain a vast number of complex

relationships between words9 and the like. Therefore in an effort to focus in on what is most

important to us as well as the task being initiated with text mining we analyze and decide on

the most important features of a document or a document collection. E.g. returning to our

insulin-pen example we found that we could infer meaning from the document without

including certain words and characters. Having done that we would then name this our

representational model for that document and doing so is a crucial part of preparing our

textual data for analytical processing. (Feldman & Sanger, p. 4, 2007)

The domain (Feldman & Sanger, p. 8, 2007): The domain in relation to text mining can

broadly be defined as our area of interest. The area which we through our efforts wish to

know more about or add knowledge to. It seems reasonable to argue that any specific area of

knowledge has a wide range of concepts describing important distinctions, relationships and

terms important to that area. Because of this background knowledge of the domain is often

found to be helpful in discovering new concepts. E.g. if you performed a text mining study on

the total of the documents that deal with text mining you would find that concepts are also

sometimes referred to as keywords, although this is not included in what we have presented

here. Furthermore it seems quite obvious that we would not be able to make the judgment

that concepts and keywords represent the same thing if we did not know of concepts.

9 Currently, the English language holds more than a million words and as such a document in English can potentially hold more than a million different features. Source: http://www.languagemonitor.com/global-english/number-of-words-in-the-english-language-1008879/

When we in section 5 claimed the importance of having a business case in mind before we

start to establish a collective intelligence system we presented a couple of arguments.

However, as is hopefully now shown this is also highly related to text mining methodology. It

seems that in order to get the most out of a text mining initiative we require some amount of

knowledge about what it is we are looking for. Because of the many different dimensions of

meaning language can contain, text mining is not a magical device that we may employ to find

every hidden intention contained in the written word. To ensure that the methods we employ

deliver information, and perhaps even information that we could conceivably make use of, they need a fair

amount of guidance. In the coming section we look to methods of preparing the data for

analytical processing. We present methods that serve to reduce the aforementioned

dimensionality and convert the individual documents to their representational models.

6.2. Preparing data for Text Mining

There are a myriad of ways to help prepare a document collection for analytical processing.

Some focus on the grammatical properties of the documents to find e.g. nouns, verbs and

adjectives. Others focus on breaking up the document into smaller pieces by a process called

tokenization, which treats the document as a continuous stream of characters and then

(dependent on our choice) displays it as e.g. sentences, words or even syllables. Most

commonly we would be most interested in finding out which sentences or words are in the

document by the mere rationale that it must be easier to then decipher meaning. It does

depend on the task at hand though. You could also, if it fits the task, stem the words in the

document. Stemming means that we return each word in the document to its simplest form

e.g. by removing affixes such as ‘ing’ and ‘un’ from words. (Feldman & Sanger, p. 60, 2007) However,

undoubtedly the most important of any of these processes (or more commonly preprocessing

techniques) is the conversion of the textual data into numeral representations. We do this by

interpreting each document as a vector containing features, whereby each unique feature

represents a single dimension in the vector. (Feldman & Sanger, p. 89, 2007) It may be

difficult to see how this understanding applies so we will attempt to show this through the

following example. The sentence: “A dog is an animal. A dog is the best friend of man. A man is

an owner of a dog.” when interpreted as a vector of features looks like this:

Feature   a   an   of   the   is   dog   man   best   friend   owner   animal   .
Count     4   2    2    1     3    3     2     1      1        1       1        3

or simply

(4,2,2,1,3,3,2,1,1,1,1,3)

In other words we take each feature of the document, count their frequency and from that

form a vector with a dimensionality corresponding to the number of unique features. Most

would probably agree that when we break the sentences apart and depict them as such the

punctuation marks seem less meaningful relative to the words in the document. Some might

even agree that the words a, an, of, the and is are less meaningful. Just as with our insulin-pen

example we might be able to derive the meaning of the document from “dog”, “man”, “best”,

“friend”, “owner” and “animal”. E.g. that a dog is an animal and that a man is an owner, since it

seems hard to conceptualize a dog as being the owner of something. When we have this

interpretation in hand we might even be able to infer that there is some “best friend”-

relationship between the man and the dog or the owner and the animal. Since it is our wish to

reduce the document to its simplest meaningful interpretation we commonly remove those

words (called stop words) we deem to be less meaningful. (Feldman & Sanger, p. 68-69, 2007)
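The conversion of the example sentence into a feature vector, and the subsequent removal of stop words, can be sketched as follows. The tokenization rule and the choice of stop words are our own assumptions made for this illustration; only the sentence and the order of the features are taken from the example above.

```python
import re
from collections import Counter

text = ("A dog is an animal. A dog is the best friend of man. "
        "A man is an owner of a dog.")

# Tokenize into lowercase words and full stops (a simplifying assumption).
tokens = re.findall(r"[a-z]+|\.", text.lower())
counts = Counter(tokens)

# One dimension per unique feature, in the order used in the example.
feature_order = ["a", "an", "of", "the", "is", "dog", "man",
                 "best", "friend", "owner", "animal", "."]
vector = tuple(counts[feature] for feature in feature_order)
print(vector)

# Removing the features we deem less meaningful (stop words and
# punctuation) leaves a smaller representational model.
stop_words = {"a", "an", "of", "the", "is", "."}
reduced = {f: counts[f] for f in feature_order if f not in stop_words}
print(reduced)
```

The reduced model has six dimensions instead of twelve, which illustrates how preprocessing lowers the dimensionality of a document before any analysis takes place.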

When we have gone through these aforementioned motions pertaining to the preprocessing

of our textual data we can move on to where the fun starts, namely, the application of

methods that will draw out information.

6.3. Categorization of documents

We start with categorization which can broadly be defined as a process of assigning each

document to one or more specific categories based on a judgment of what category a given

document best fits. Categories can be decided on either by reference to the content of the sum

total of the documents within a collection or by reference to some background knowledge

about the area of interest. (Feldman & Sanger, p. 64, 2007) There are a lot of different

proposals for how this can be done, so we outline here some of the basic distinctions in

regards to categorization and then move on to show the method by which we will be

categorizing documents in our case.

As we mentioned, dependent on the task at hand, we can decide whether our documents can

belong to one or more categories. If a document can belong to one category only we call this

single-label categorization. In turn, if a document can belong to one or more categories it is

called multilabel categorization. (Feldman & Sanger, p. 67, 2007) I.e. if we are only interested

in finding out how many of our documents are about patients of diabetes then a single-label

approach may be appropriate. If however our interest lies only in documents about patients of

diabetes and their relationship with some specific medicine, then a multilabel approach will

most likely be best. Categorization can also be either document- or category-pivoted, which

merely serves to say whether we are trying to find documents that fit a certain category or all

categories that fit a document. In other words whether we take our outset in the categories or

the documents and this speaks primarily to how the given method of categorization functions.

(Feldman & Sanger, p. 67, 2007) Furthermore, methods may perform either hard or soft

categorization. When hard the division into categories is fully automated, which leads to a

result where a document either belongs or does not belong to a category. Here there is no

ambiguity, no in between. When soft the method will deliver to us a list of possible categories

for each document to which they might belong. The final choice of categories is then in the end

for us to decide. Lastly, this relates to what is termed the categorization status value which in

short is a number between zero and one that in its essence serves to tell us “how much” a

document belongs to a certain category. (Feldman & Sanger, p. 67-68, 2007)
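The difference between hard, soft and single-label categorization can be sketched from a set of categorization status values. The values, the category names and the threshold of 0.5 below are hypothetical and chosen only for illustration; they do not stem from our case.

```python
# Hypothetical categorization status values for one document.
status_values = {"diabetes": 0.82, "medicine": 0.55, "sports": 0.05}


def hard_categorize(scores, threshold=0.5):
    """Fully automated decision: the document belongs or does not belong."""
    return {category: value >= threshold for category, value in scores.items()}


def soft_categorize(scores):
    """Ranked list of candidate categories; the final choice is left to us."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def single_label(scores):
    """Single-label categorization: keep only the best fitting category."""
    return max(scores, key=scores.get)


print(hard_categorize(status_values))
print(soft_categorize(status_values))
print(single_label(status_values))
```

With these values a multilabel, hard categorization would place the document in both "diabetes" and "medicine", while a single-label approach would keep "diabetes" only.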

Term Frequency – Inverse Document Frequency (Feldman & Sanger, p. 68, 2007)

Most commonly referred to as the TF-IDF weighting scheme this method seeks to deliver to us

a representation of a given feature’s relevance based on analysis of the document collection as

a whole. In other words TF-IDF calculates a weight for each feature in a document relative to

the sum of that feature’s presence in the collection. The mathematical description of the

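method looks like this:

$$\mathrm{TFIDF\ Weight}(w, d) = \mathrm{TermFreq}(w, d) \cdot \log\left(\frac{N}{\mathrm{DocFreq}(w)}\right)$$

Here $N$ denotes the total number of documents in the collection, and the logarithm is taken to base 10.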

It states that we calculate the weight of the word w in the document d by taking the frequency

of that word in the document and multiplying it with the logarithm of the total number of

documents divided by the number of documents containing that word. In other words if we

have a total of 100 documents in our collection and 50 of these contain the word diabetes, and

a single document contains the word diabetes 5 times, we would find that the weight of the

word diabetes in that document would be 1.50515. This number would then be our indication

of the relevance of the word diabetes in relation to our collection of documents. TF-IDF does

this for all features in a document collection and ranks these features according to their

weight. We can then use this as a way to find the most relevant features and from that derive

categories fitting for the document collection, and furthermore it will allow us to detect

documents which are outside the scope of our interest. I.e. if there is no mention of the

‘category’ we are looking for we may decide that the document is not of relevance to us.
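The calculation described above can be sketched in a few lines. We assume a base-10 logarithm, since that is the base which reproduces the weight of 1.50515 from the diabetes example; the function names and the small document collection are our own and purely illustrative.

```python
import math
from collections import Counter


def tf_idf(term_frequency, total_documents, documents_with_term):
    """Term frequency times log10(total documents / documents containing the term)."""
    return term_frequency * math.log10(total_documents / documents_with_term)


# The example from the text: 100 documents, 50 contain "diabetes",
# and one document mentions the word 5 times.
weight = tf_idf(5, 100, 50)
print(round(weight, 5))


def tf_idf_for_collection(documents):
    """Weigh every feature of every document (documents are lists of tokens)."""
    total = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return [
        {term: tf_idf(tf, total, doc_freq[term]) for term, tf in Counter(doc).items()}
        for doc in documents
    ]


# A tiny hypothetical collection of already tokenized documents.
collection = [
    ["diabetes", "insulin", "diabetes"],
    ["diabetes", "sports"],
]
weights = tf_idf_for_collection(collection)
print(weights)
```

Note that a word occurring in every document of the collection receives a weight of zero, which reflects that it does not help us tell the documents apart.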

6.4. Clustering of documents

The second method we wish to focus on is the method of clustering and when defined broadly it

is a process by which we group documents together in so-called clusters based on a

calculation of their similarity. Similarity in the context of text mining is assessed with regard

to the feature-content of the documents in the collection. (Feldman & Sanger, p. 82, 2007)

In order to conceptualize how this is done we refer back to section 6.2 where we noted that

when applying text mining we typically transform documents into a vector in which each

unique feature represents a given dimension. This is highly relevant in the context of methods

of clustering because in order to calculate the relative similarity between each document we

perceive each document as a vector in a given space. There are a couple of things to note

about the method of clustering. To detect similarity between documents we use what is called

the cosine similarity measure and the way in which this calculates similarity is by referring to

the weights calculated by the TF-IDF method and from that find the relative angle or distance

from a document to another based on those weights. In other words based on the relative

composition of word-level features in the documents. (Feldman & Sanger, p. 85, 2007) Based

on such a measure of similarity the k-means clustering algorithm then seeks to group similar

documents together and separate those that are not similar. Just as with categorization we

can apply either hard or soft clustering, where again hard clustering means that a document

can belong only to one cluster and soft means they may belong to more than one. (Feldman &

Sanger, p. 85-86, 2007)
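For readers who want a more concrete feel for the cosine similarity measure, the following is a minimal sketch of the standard definition: the dot product of two document vectors divided by the product of their lengths. The three vectors are hypothetical TF-IDF weights invented for this illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A document without any weighted features is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical TF-IDF vectors over the same three features.
doc_a = [1.5, 0.0, 0.3]
doc_b = [3.0, 0.0, 0.6]   # same composition as doc_a, only "longer"
doc_c = [0.0, 2.0, 0.0]   # shares no features with doc_a

print(cosine_similarity(doc_a, doc_b))
print(cosine_similarity(doc_a, doc_c))
```

Two documents with the same relative composition of features point in the same direction and score close to one regardless of their length, while documents sharing no features score zero.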

“Irrespective of the problem variant, the clustering optimization problems are

computationally very hard.” (Feldman & Sanger, p. 85, 2007)

We had the good fortune of being able to explain in short how the TF-IDF algorithm works

and delivers results, however, the same cannot be said for the cosine similarity measure nor

the k-means clustering algorithm we will be applying later. In any case, as you shall see, we

will not manually be involved in the execution of these algorithms. As such we find it sufficient

that we understand how to make use of them, and possess an understanding of the type of

information it reveals about our document collection. In other words we stick to a level of

conceptual understanding in regards to the algorithms applied in our clustering activity. As

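such the following example serves to enhance that understanding:

[Figure not reproduced: documents plotted as points in a vector space, grouped into a red, a blue and a green cluster, with five documents outside any cluster.]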

This is of course a simplification of what happens when we perform the k-means clustering

activity but if we conceive of this as the space in which our documents are represented as

vectors, we can see that three clusters have been formed. Here we have 19 documents in total

and based on TF-IDF weights and subsequent similarity measure of the distance between

them within this space, we find that 14 documents are deemed to be similar enough to be put

into some cluster, while five documents are not and as such are separated from the clusters.

The red cluster is frequently using the word ‘happy’, the blue uses a mix of ‘happy’ and ‘sad’

and the green is a bit less ‘happy’ than the blue. Such could be the results of a clustering

activity performed on a document collection and it would tell us that some of our documents

have been deemed to have similar content.
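To support the conceptual understanding, the following is a minimal sketch of a k-means style procedure that groups documents by cosine similarity. It is not the implementation applied in our case: the documents, their two features (weights for ‘happy’ and ‘sad’) and the starting centroids are hypothetical, and the starting centroids are fixed by hand so that the outcome is reproducible.

```python
import math


def normalize(vector):
    """Scale a vector to length one so that dot products equal cosine similarity."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector] if length else list(vector)


def cosine(u, v):
    # Both inputs are unit vectors, so the dot product is the cosine similarity.
    return sum(a * b for a, b in zip(u, v))


def k_means(vectors, initial_centroids, iterations=10):
    """Return the index of the cluster each vector ends up in."""
    points = [normalize(v) for v in vectors]
    centroids = [normalize(c) for c in initial_centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each document joins its most similar centroid.
        assignment = [
            max(range(len(centroids)), key=lambda k: cosine(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                mean = [sum(column) / len(members) for column in zip(*members)]
                centroids[k] = normalize(mean)
    return assignment


# Hypothetical documents described by (weight of 'happy', weight of 'sad').
docs = [(5, 0), (4, 1), (1, 5), (0, 4), (3, 3), (2, 2)]
labels = k_means(docs, initial_centroids=[(1, 0), (0, 1), (1, 1)])
print(labels)
```

Note that plain k-means assigns every document to some cluster. Leaving documents outside all clusters, as in the description above, would require an additional rule such as a minimum similarity threshold, which is not sketched here.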

6.5. Text Mining for stakeholder opinion

We have spent all this time explaining, in more or less detail, concepts and methods

derived from the field of text mining because to our knowledge this is the best

(perhaps even the only) way to effectively access and decipher communications on social

media. When a user is actively involved in maintaining an account on e.g. Facebook or Twitter

they, in this interpretation, disseminate into the digital space a number of unique documents.

Whether they do so in a commercial respect or just to communicate with friends is in this

context irrelevant. We will be able to apply text mining to find out information about or derive

information from these communications. When we move into the case material we propose

that we may use categorization as a way of making sure that the document collection in

question is actually about our domain of interest. Furthermore, we will propose that we look

to clustering activities as a way of performing additive aggregation on the opinions and

proposed solutions by stakeholders in our system. In other words when we, and this is exactly

our purpose, take the sum total of the communications of stakeholders we have chosen to

include in our system, we may be able to apply these methods to perform the kind of

averaging needed to gain insight into the relative significance of these opinions and solutions.

In the coming section we take a close look at the social technology which we will be focusing

on throughout the rest of the thesis. Until now we have focused mostly on social media in

general terms and we note here that the selection of a specific technology is not a rejection of

our belief that the proposed model for engaging in social media can be applied on multiple

technologies. In this we will be outlining the argument for our choice of Twitter and then

move on to analyze the capacity of Twitter as a social technology to meet the demands laid out

in section 5.

7. Enter Twitter, “Instantly connect to what’s most important to you.”

Twitter can be described as a technology for communication and information sharing. A

popular term within literature describes it as a microblogging service, which serves to relate

it to the well-known concept of blogging. The term micro figures in this description because of

a restriction the service puts on the number of characters that can be included in any single

post (or as it is called, a tweet). If you want to make use of the service you have to grow

accustomed to formulating your thoughts and opinions in 140 characters or less. (Lovejoy,

Waters & Saxton, p. 313, 2012) Even so it has become one of the largest online social

networks and continues its very rapid growth each and every day. At the time of writing the

service holds more than 628 million unique accounts and every second 12 more accounts are

registered.10

“Social media sites allow for the rapid dissemination of information as well as the rapid

exchange of information. Twitter amplifies the rapidity of the information exchange by

limiting the size of the messages to easily digestible information pieces.” (Lovejoy et al., p.

313, 2012)

This is so because Twitter, like most social technologies, works in real-time. As soon as you

press the “Tweet”-button the information contained in your message is sent out into the

Twittersphere to millions of potential readers. A common criticism of Twitter as a social

technology is that no meaningful information can be contained in 140 characters (Lovejoy et

al., p. 313, 2012) but as we will see users seem to have found ways to circumvent this, and

quite obviously it has not hindered the adoption of the technology. In order to understand in

detail how Twitter works and how it, as is our wish, may facilitate the establishment of a CI-

system we refer to a study published a few years back by Kwak, Lee, Park and Moon (2010).

The study is by these authors proclaimed to be the first ever to study Twitter in its entirety

10 http://twopcharts.com/twitter500million

and as such conclusions are derived from analysis of 41.7 million user profiles, 1.47 billion

social relations and 106 million tweets. (Kwak et al., p. 1, 2010) In the following we will

highlight some of the conclusions presented in this study along with some presented by Finin,

Java, Song, Tseng (2007) in order to thoroughly understand what Twitter is and how

stakeholders are using it. First, however, we cover the most basic functionalities of Twitter as

a tool for communication and social interaction.

7.1. Twitter as a Collective Intelligence System

When you create an account on Twitter you may provide the service with your full name,

location, web page along with a short biography. Nothing more is needed before you are able

to start sending out your tweets into the Twittersphere. In order for your tweets to actually

reach people there are a few options. You can either spend some time connecting to other

accounts by using Twitter’s internal search engine to find accounts which speak of topics

important to you. If you have a somewhat clear image of which topics you are interested in

you may use the hashtag function to find these. Let’s pretend you were quite interested in

anything having to do with sports; you might then type both sports and #sports in the search field.

If you are wondering why both sports and #sports have been typed in this is because Twitter

has the capacity to search on keywords such as ‘sports’, which will return to you all the tweets

containing the word ‘sports’. The so-called hashtag #sports is then another functionality of

Twitter that allows users to tag their tweets so as to pre-emptively categorize their tweets.

(Lovejoy et al., p. 314, 2012) Generally you could interpret a hashtag as a user’s declaration

that this is the topic her tweet is about. Not that this is strictly upheld; you might compose a

tweet solely of words with a hashtag in front of them. However, a quick search for any

hashtag-topic should return a picture, which makes it seem plausible to suggest that this is

the general consensus.
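The difference between a keyword search and a hashtag search can be sketched on a handful of tweets. This is a toy illustration only: the tweets are invented, and the matching rules are our own assumptions, not a description of how Twitter's search engine works internally.

```python
import re

# Hypothetical tweets invented for this illustration.
tweets = [
    "Great game last night #sports",
    "I love watching sports on weekends",
    "New recipe for dinner #food",
]


def keyword_search(tweets, keyword):
    """Return all tweets containing the keyword as a whole word."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [tweet for tweet in tweets if pattern.search(tweet)]


def hashtag_search(tweets, tag):
    """Return only tweets whose author tagged them with #tag."""
    pattern = re.compile(r"#" + re.escape(tag) + r"\b", re.IGNORECASE)
    return [tweet for tweet in tweets if pattern.search(tweet)]


print(keyword_search(tweets, "sports"))
print(hashtag_search(tweets, "sports"))
```

The keyword search returns both tweets mentioning sports, whereas the hashtag search returns only the tweet its author explicitly declared to be about that topic.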

Having done your topic-specific reconnaissance you might then start to delve deeper into the

heart of Twitter by beginning to ‘follow’ other accounts. When you decide to follow an account

you subscribe to a feed from that account, which means that in

the future you will be receiving every new tweet that account sends out. (Kwak et al., p. 1,

2010) From then on these tweets and the tweets from any other account you choose to follow

will be shown on your ‘home’-tab.

Depending on each unique situation an account might choose to reciprocate the act by

following your account as well. This would then constitute a friend-relationship but in its

essence it simply means that this account will be receiving your tweets as well. Another basic

element we will cover before moving on is that of the ‘retweet’, which is a core functionality of

Twitter that establishes the foundation for interaction between accounts. You can direct

messages at specific accounts (using @account), however, when you decide to retweet a tweet

you are effectively copying another account's tweet and posting it again for your followers to

read. (Kwak et al., p. 1, 2010) E.g. if an account with 10 followers sends out a tweet this would

only be sure to reach 10 other accounts, however, if one of the followers has 50 additional

followers and chooses to retweet it then it is sure to reach 60 accounts. When we see Lovejoy

et al. (2012) in section 7 claiming that information disseminates rapidly on Twitter the

retweet has a huge part in this. We will return to this later in the section.
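
To make the arithmetic of the retweet example concrete, the following minimal Python sketch treats guaranteed reach as the union of the follower sets of the author and of every account that retweets. The account names and the helper guaranteed_reach are invented for illustration and are not part of Twitter's API or of our dataset.

```python
def guaranteed_reach(author, followers, retweeters):
    """Return the set of accounts that are sure to receive the tweet in their feed."""
    reached = set(followers.get(author, set()))
    for account in retweeters:
        # Every retweet exposes the tweet to that account's own followers.
        reached |= followers.get(account, set())
    reached.discard(author)  # the author does not count as an audience member
    return reached


# Toy data mirroring the example in the text (all names are hypothetical).
followers = {
    "author": {f"a{i}" for i in range(10)},  # the author has 10 followers
    "a0": {f"b{i}" for i in range(50)},      # one follower has 50 additional followers
}

before = guaranteed_reach("author", followers, retweeters=[])
after = guaranteed_reach("author", followers, retweeters=["a0"])
print(len(before), len(after))
```

Because the follower sets are combined as a union, followers shared between the author and a retweeting account are only counted once, which is why the text speaks of additional followers.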

There are of course many ways of getting into Twitter when you first start out but this short

account will be sufficient for our purpose. Kwak et al. (2010) take a closer look at this

functionality and how users in the Twittersphere are actually employing it. Looking at the

follower-followed relationships on Twitter they found that 77.9% of the time when a user

decides to follow another account it remains a one-way connection, and as such only 22.1% of

relationships on Twitter can be said to be reciprocal. Furthermore, they showed that 67.6% of

users are not being followed by any of those who they have chosen to follow. (Kwak et al., p. 3,

2010) Furthermore, they studied the network properties of Twitter and found that to get


from any one given account to another given account there was an average separation of only

4.12 jumps, which in network topology speaks to how many people you would have to get

through to get to a complete stranger. This deviates from classic real-world networks, where

on average six jumps would yield the same result. (Kwak et al., p. 3-4, 2010) They note that for

93.5% of users information needs to travel less than five jumps to go from any one account to

another. What this means is that Twitter is a compact network and because of this

information on Twitter may spread more easily to users outside ‘your own network’. (Kwak et

al., p. 4, 2010) These observations led the researchers to conjecture that for many users

Twitter might be a source of information more than a site for social interaction. (Kwak et al.,

p. 3-4, 2010) Although this is only conjecture it presents an interesting indication in relation

to our project since it is essentially our goal to do just that, namely, find information.
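
The two network measures discussed above, reciprocity and average separation, can be illustrated on a toy follow graph. The sketch below is only our illustration under simplifying assumptions: the graph is invented, it is treated as directed, and separation is averaged over reachable pairs using a breadth-first search. Whether this matches the exact procedure of Kwak et al. (2010) is an assumption on our part.

```python
from collections import deque


def reciprocity(follows):
    """Share of follow links (a -> b) for which b also follows a."""
    links = [(a, b) for a, targets in follows.items() for b in targets]
    mutual = sum(1 for a, b in links if a in follows.get(b, set()))
    return mutual / len(links)


def average_separation(follows):
    """Mean shortest-path length over all reachable ordered pairs of accounts."""
    total, pairs = 0, 0
    for start in follows:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from the start account
            node = queue.popleft()
            for nxt in follows.get(node, set()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the start account itself
    return total / pairs


# Hypothetical graph: each key follows the accounts in its set.
follows = {"a": {"b"}, "b": {"a", "c"}, "c": {"d"}, "d": set()}

share_reciprocal = reciprocity(follows)
mean_jumps = average_separation(follows)
print(share_reciprocal, round(mean_jumps, 2))
```

In this toy graph only the link between a and b is mutual, so half of the four follow links are reciprocated. On the real network the corresponding figures reported by Kwak et al. are 22.1% and 4.12 jumps.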

The last observation we wish to include presents itself in the researchers' study of the retweet.

Here they found that when any given user sends out a tweet, and another user decides to

retweet it, the original tweet will on average reach 1,000 users irrespective of how many

followers the original author of the tweet had at the time of writing. (Kwak et al., p. 8, 2010)

“Individual users have the power to dictate which information is important and should

spread by the form of retweet, which collectively determines the importance of the original

tweet. In a way we are witnessing the emergence of collective intelligence.” (Kwak et al., p.

8, 2010)

This again may speak to the strength of Twitter's functionality in regard to information

dissemination and collection. These observations at the very least grant us an indication that

people may be able to use and may be using Twitter as a source of information. We can be

satisfied that information spreads easily and quickly, and perhaps this might even

provide us with an indication that Twitter is especially suited for stakeholder engagement

initiatives seeking to learn from stakeholders. Theoretically the demands are met, since the

core functionality that is given means we will be exposed to more opinions and more

information. Potentially we might even reach an enhanced number of stakeholders with our

communications referring to the compact network structure and the properties of the

retweet. However, to stay congruent with our project we cannot uncritically indulge ourselves


in this perception. We need a way to assess the relative relevance of each stakeholder (or

account) included in our system. Undoubtedly, as we discussed in section 5, this will demand

some research to be carried out since effectively all the information we have about a given

user is the set of descriptions she chose to include when the account was created. In order for us

to have some starting point in this we refer to the following four categories of user intentions

(Finin et al., p. 7-8, 2007):

Daily chatter: Tweets containing information about e.g. daily routines and what a given user is

doing at the time of writing.

Conversations: Tweets containing a given conversation between two accounts, characterized

by the presence of @account in the tweet.

Sharing information/URLs: They characterize a tweet as one that has the purpose of sharing

information if that given tweet contains a URL linking to some source of information.

Reporting news: Tweets containing information about some form of latest news. Could either

be a reiteration of news from an external source or a reference to some news pertaining to

Twitter.
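
As a starting point for interpretation, the four categories can be approximated with simple surface cues. The sketch below is a hypothetical heuristic of our own and not the method used by Finin et al. (2007): the presence of a URL and of @account follow the descriptions above, while the list of news cue words and the order in which the rules are applied are assumptions made purely for illustration.

```python
import re

MENTION = re.compile(r"@\w+")
URL = re.compile(r"https?://\S+")
# Hypothetical cue words; a real system would need a far richer signal.
NEWS_HINTS = ("breaking", "news", "report", "announces")


def guess_intention(tweet):
    """Assign a tweet to one of the four intention categories by surface cues."""
    text = tweet.lower()
    if URL.search(text):
        return "sharing information/URLs"
    if MENTION.search(text):
        return "conversations"
    if any(word in text for word in NEWS_HINTS):
        return "reporting news"
    return "daily chatter"  # fallback when no other cue is present


examples = [
    "@MyGlu thanks for the tip!",
    "New study on insulin pumps http://example.org/study",
    "Breaking: WHO announces new diabetes guidelines",
    "Just finished my morning run",
]
guesses = [guess_intention(t) for t in examples]
for tweet, guess in zip(examples, guesses):
    print(guess, "<-", tweet)
```

A tweet can of course carry several cues at once, so the output of such a heuristic should only be read as a first sorting that still requires human interpretation.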

These are obviously quite broad categories and they stem from research carried out in 2007

and we know that Twitter has evolved greatly since then. However, it is to our knowledge the

only study which has attempted to tackle the challenge of deriving categories of user

intentions on Twitter. As we mentioned previously in the section they are meant to serve as a

starting and reference point, which hopefully may aid us in the interpretation inevitably

required to assess whether an account can be deemed suitable for inclusion in a collective

intelligence system. However, what is perhaps a more interesting observation presented by

Finin et al. (2007) is that they were able to find multiple community structures within the

Twitter network e.g. as when they found a community in which the talk was about gaming.

“Based on our study of the communities in Twitter dataset, we observed that this is a

representative community in Twitter network: people in one community have certain


common interests and they also share with each other about their personal feeling and

daily experience.” (Finin et al., p. 6, 2007)

It may not immediately be clear why this is interesting, however, if we remember back to

section 5 where we spoke of self-organization as a means to a successful CI-system, it may be

more clear. Self-organization speaks to the demand that interaction need be possible between

the stakeholders we include in the system. This study then reveals that the foundation for

such interaction might well exist on Twitter, that is, if we can find it. Unfortunately it will not

be possible for us in the context of this thesis to carry out the link and network analysis which

could reveal such a community structure. As such we leave it at this mention.

The purpose of this section was to delve deep into the functionalities of Twitter in order to

reveal how the technology works, how people are using it and more importantly how it might

be able to serve as the foundation for the employment of a CI-system. Referring to section 3,

you might even say that the necessary genes are present on Twitter. The crowd-gene is

activated in so far as anyone can join Twitter, the create-gene when a tweet is posted and we

might construe the functionality behind the retweet as one which supports the decide-gene.

A tweet is a unique contribution and as such the collective-gene is activated, whether the

collaboration-gene is also present through the retweet we leave up to interpretation. We

believe that Twitter has capabilities that other social networks do not. The mere fact that you

can follow someone without that act being reciprocated provides possibilities for gathering

information that other services, where relation is built on a mutual agreement, do not. As we

said at the end of section 6.5 this does not mean that we reject the possibilities of

establishment on other social technologies. They may each hold benefits as well as drawbacks

e.g. it will probably be easier assessing the relevance of an account on Facebook than on

Twitter by sheer accessibility of personal information, which might grant credence to the

information disseminated from the account in question. This question of whether the

information disseminated from an account can be said to be credible is a topic we discuss

later in section 10. From what we have presented in this section, however, it does seem

reasonable to suggest that if it is information we are looking for Twitter might just be an

effective vehicle.


In the coming section we take a step back and assess what we now know about text mining

and Twitter respectively in order to correlate these considerations to the core principles

presented in section 5, and lay the finishing touch on the proposed model for taking

stakeholder engagement to social media.

8. Stakeholder Intelligence on Twitter

We have now wandered across multiple fields of theory in an effort to find aspirations and

perspectives that may aid us in taking stakeholder engagement to social media. We looked at

stakeholder theory and uncovered its own perspective on what the prospect might entail. We

hinged on perspectives such as those proposed by the logic of community and found that in

essence the aspiration is to allow for more inclusivity and for engagement-topics to be

spawned in the stakeholder community. Given that these perspectives provide little insight

into how we might bring them into business practice we found further support in collective

intelligence. With this we found support for the benefits that might be derived from including

more stakeholders and allowing them to contribute to solutions by their own admission.

Furthermore, collective intelligence taught us that we would have to somehow assess the

stakeholders we choose to include if we are to produce value from such efforts. We then

moved on to text mining in order to propose a method for extraction of the information

spawned by stakeholders on social media. Lastly, we provided a detailed description of the

activities on the social technology Twitter to propose a place, where we could establish a CI-

system. However, as we described in section 5 we have yet to cover a few aspects before we

can be satisfied that we have something we can bring into business practice.

In section 3 we spoke in short of how the purpose of all of this is to strengthen decision-

support and thereby decision-making in the company. In saying so we are however also

aware that decision-making in a company is far from a static concept. Broadly speaking

everyone can be said to constantly make decisions with immeasurably different types of

information lending support to their choices. This might shed some light on the importance of

zoning in on which decisions we are trying to support with a CI-system, as one decision needs

some information, while another needs some other information. To further support this we

refer to section 6 where we have shown that text mining, to be an effective medium for


gathering information, needs a guiding domain. Furthermore, in section 7 we have shown that

Twitter holds more than 500 million different accounts and that tweets posted may contain

everything from daily chatter to news reports. All of this coupled with the points emphasized

in section 5 may bespeak the necessity of having a clear-cut business case (or domain) before

engaging and before we relinquish control.

What we are trying to establish in this thesis might carry many different connotations. Some

might call it a CI-system, some an enhanced focus group. However, if we try to conceptualize

who might be put to the task of establishing it in the context of an actual company it would

most probably be handed to the business intelligence department. As such the field of

business intelligence seems the right place to look for inspiration for how to ‘zone in’. From

this we derive another core principle of the model, which relates to the model as a whole and

how we might best bring the information into the company.

Include the business professional

Imhoff and White (2011) present a TDWI11 Best Practices Report pertaining to the

implementation and execution of business intelligence initiatives in a company. In this report

they introduce the concept of self-service business intelligence, which broadly speaking can

be defined as establishing the institutional and technological capabilities that allow

information workers to decide for themselves what information needs their work entails.

(Imhoff & White, p. 4, 2011) It is an extensive report which goes into great detail in relation to

technological requirements. Here we will focus on the benefits described from providing

information workers with the opportunity to enter into the process of deciding what

information they need.

“To create a sustainable and appropriate self-service BI environment, the implementers

must thoroughly understand the information workers who will be using the environment.

They must understand their motivations, mode of working (e.g., mobile, geographically

dispersed, virtual) and, of course, their technological skill sets.” (Imhoff & White, p. 11,

2011)

11 The Data Warehousing Institute, www.tdwi.org


The benefits described from applying such a perspective when bringing information into the

business include a declining demand for involvement from the IT department in the daily

workings, more satisfaction with the services (and the IT department) from information

workers and that the IT department may become a partner to these instead of a nuisance. The

first and the last benefit described, one might suggest, could relate to the thinking that if we

allow those with information needs to enter into the process of deciding what information is

sent their way, then we might be able to bring about a more sustainable solution. The second

benefit may speak to a perceived satisfaction of being involved in the creation of the processes

that shape one’s daily routines. (Imhoff & White, p. 11, 2011) We will not claim this to be an

exact reiteration of the motives behind such an initiative but if we take a closer look it seems

to promote a perspective, where we as a company take the stand that our employees might

be quite knowledgeable about what they need for their work to be carried out. If we perceive

an employee as a stakeholder then one could suggest, according to our previous

argumentation, that they too could be involved in decision-making with the promise of a

better solution.

Accordingly, we propose to include the business professional in the establishment of a CI-

system. We believe this will provide us with the best possible way of zoning in on what

information we need from the system, and if this does indeed increase employee satisfaction

then we shall gladly reap that benefit as well. Most of all, however, we relate this to the

establishment of a business case and the selection of which stakeholders to include. Imagine

an employee whose daily work revolves around relating to stakeholders to communicate the

company’s position on a given topic to them and learn from their perspectives. Would it not

be reasonable to suggest that this employee would be an invaluable resource in deciding what

information is important, and who might be able to deliver to us valuable information? We

note here that this of course must not lead us to a too narrow selection of stakeholders,

because as Bonabeau (2009) mentioned in section 3 outreach is needed. However, we believe

there is support for the rationale of taking such suggestions and insight into our

considerations.

In order to provide an overview of the steps or processes laid out in this section as well as in

section 5 we include here a graphical representation of our proposed model:


Much of what is shown in the model has already been discussed and as such should not need

further explanation. However, when we state that research should be done to find one or

more social technologies this may demand a bit of explanation. This relates to the increasing

capabilities of text mining systems to encompass more than one domain. (Cohen & Hunter,

p.2, 2008) In this thesis we stick to the one. When the model presents a step involving an

initial analysis of the information extracted from the stakeholders selected this relates to an

attempt at ensuring that we do not deploy a system, which delivers information that the

employee in question cannot make use of. Generally, it seems common sense to suggest that

we should first make sure that the stakeholders we have chosen to include also are able to

deliver value before financing the deployment of a long-term system.

In the following section we move into the case material related to this thesis. We had the good

fortune of corresponding with Scott Dille, a Communications Manager from Novo Nordisk

who sits in their department for Corporate Sustainability. We describe Novo Nordisk as a


company in relation to stakeholder engagement along with descriptions of the daily work

carried out by the manager. Here we include descriptions of what, in essence, the goal of his

work is and we also describe some of the challenges he is faced with being an employee

communicating on behalf of a large pharmaceutical company. Lastly, we attempt to apply

our own model to this position in this specific context in order to, after we have been through

the case, be able to better evaluate our model.

9. The case of a Communications Manager at Novo Nordisk

Novo Nordisk A/S is a large European pharmaceutical company with headquarters based in

Denmark, and departments in 75 countries total. The company as it is today came to be

through a corporate merger in 1989 and has since then worked to ensure the progress of the

capabilities within the area of diabetes care and other diseases. (Novo Nordisk 1, 2012)12 In

later years, rising demands for treatment of haemophilia drove the company to establish The

Novo Nordisk Haemophilia Foundation, which as they describe was to underline the

company’s social responsibility within haemophilia care. (Novo Nordisk 2, 2012) This citation is

included because a look at Novo Nordisk’s history tells the story of a company that, from the

get go, has had a stern focus on its societal, scientific and environmental context. Before the

merger Nordisk, as one of the companies involved was called, in 1926 established the Nordisk

Insulin Foundation to support research, and people with diabetes in Scandinavia. The second

company involved, named Novo, in 1951 established the Novo Foundation to support

scientific, social and humanitarian causes. Fast forward to 2006, when the company signed an

agreement with the World Wide Fund for Nature to become part of the WWF Climate Savers

program, committing to a 10% reduction of its carbon emissions by 2014. (Novo Nordisk 2,

2012) The list could continue on, but we will stop here, and as this information stems from

the company’s own website we take this picture-perfect account with a grain of salt. However,

looking at the facts and the actions carried out through the years most would probably agree

that we are here dealing with a company which stays aware of how it affects its external

environment. This may be supported by the fact that the company has been thoroughly recognized

through the years, winning several awards for its performance related to Corporate Social

Responsibility. (Novo Nordisk 2, 2012)

12 The totality of the information presented about Novo Nordisk is taken from their website. Specific links and dates of viewing can be found in the Bibliography.


The reason for this short account naturally relates to the fact that we are occupied with the

field of stakeholder engagement, which as described in section 5 invariably relates to

maintaining a responsible nature in accordance with the Triple Bottom Line. And the

company is highly focused on cooperation with their stakeholders, who in their own words

include:

“Novo Nordisk's key stakeholders include people with diabetes and others who rely on our

medicines, customers (ie public healthcare providers and payers), employees, investors,

suppliers and other business partners, neighbours, and key publics. For us, the patient is at

the centre – and hence the ultimate stakeholder to which the company must hold itself

accountable.” (Novo Nordisk 3, 2012)

They recognize the benefits of building trust-based relationships and including stakeholders

in the conception of well-founded decisions. However, they also reveal that such relationships

are built on membership and partnership, which a closer look reveals are seemingly

exclusively available to organizations i.e. business associations, advocacy groups and think

tanks. (Novo Nordisk 4, 2012) Naturally, such established affiliations should be held in high

regard but they also have the aspiration to take stakeholder engagement to social media. This

is shown by their presence on Twitter, however, here things become a bit ambiguous, at least

if we ask the question of why they have a presence on social media. They stress that there are

quite a few subjects that they cannot discuss on social media, and that they intend to read

tweets directed at them but are not able to reply to them all. They also state quite

clearly that their purpose on Twitter is to tweet about Corporate Sustainability (Novo Nordisk

5, 2012), which might indicate unidirectionality in their purpose. However, our

correspondence with the communications manager responsible for one of these accounts on

Twitter yielded a deeper understanding of the premise that builds constraints on their

engagements on Twitter.

9.1. CSR-Communication on Twitter

As a communications manager at Novo Nordisk, Scott Dille is in his daily work responsible for

communicating about issues and news in relation to corporate sustainability. He manages the


Twitter account ‘@NovoNordiskTBL’, which with a total of 2,544 tweets, 1,970 followers and

1,908 followings seems to be an account with a solidified presence.

Being responsible for the daily dissemination of information on Twitter for a well-renowned

pharmaceutical company like Novo Nordisk of course brings about some challenges. First off,

in our correspondence it quickly became clear that they strive for a very high standard in the

content they send out into the Twittersphere. The background for this seems to be that Novo

Nordisk is a world leader in the area of diabetes care and has been for a long time. One might

easily imagine the demand for standards in the content of the information disseminated if, as

we assume is their wish, they are to maintain this status. In this position he makes daily

decisions on what to communicate about and which statements qualify for reiteration (retweet),

and from this follows a great demand for sources of information. The interest in gathering

information from sources on social media most probably sprung from this demand. As the

dataset provided was a list of accounts on Twitter this may indicate the recognition that these

can be treated as sources of information, which can be said to be congruent with our

perspective. Therefore we will attempt to apply the model in the context of Novo Nordisk

and the manager’s work in this company. However, before we move into this we need to cover

a challenge related to his work with communication on Twitter.

Since Novo Nordisk is a large pharmaceutical company dealing with products that directly

affect the health of individuals around the globe, they are of course subject to quite a few

regulatory restraints. When we found that they claim that there are some topics they cannot

discuss on social media there is actually a very reasonable explanation for this. There may be

numerous explanations in fact but the one we will include here quite clearly shows that it is

not possible to have an open dialogue on social media. It relates to industry guidance put forth

in recommendations by the Food and Drug Administration in the United States, where Novo


Nordisk also has products on the market. It relates to restrictions put on responses to

unsolicited requests for so-called off-label information about products.

“Statements that promote a drug or medical device for uses other than those approved or

cleared by FDA may be used as evidence of a new intended use. Introducing a product into

commerce for such a new intended use without FDA approval or clearance would, under

these requirements, generally violate the law.” (FDA, p. 2, 2011)

It should be fairly well-known that any drug or medical device has a clearly defined intended

use. Now this is of course rhetorically presented as recommendations but in essence this states

that any comments on a product may be conceived by the FDA as the promotion of a new

intended use. A new intended use would then most likely need new approval from the FDA,

which if nothing else presents a stern warning. However, there is also the recommendation

that when an individual puts forth an unsolicited request for information the company must

grant such information only to the individual who asked for it. (FDA, p. 7, 2011)

“A firm should ensure that all pertinent background data are obtained to be able to

determine what information is being requested before providing a response.” (FDA, p. 7,

2011)

From this it seems very reasonable that a company like Novo Nordisk would adopt a non-

disclosure policy on social media in regard to some areas of business. First off, if they may

only disclose information to the individual who asked for it that can be said to hinder the

possibility of more perspectives on the same issue. Secondly, the fact that the company must

gather all pertinent background data before providing a response further supports such a

policy, given the inefficient nature of having to carefully evaluate every question directed at

them on Twitter.

Because of this we dare to suggest that our proposal might be well-suited for a company like

Novo Nordisk, since in essence we have armed ourselves with a different perspective on how

to interact with stakeholders on social media. Through our model a company like Novo

Nordisk might be able to attentively listen to the opinions and proposed solutions


disseminated by stakeholders on social media and from that gain insight into topics of

interest, and then present their view on those topics in a manner congruent with the given

regulatory constraints. In the following we will go through the steps proposed stopping at the

initial analysis of information, since we in the context of this thesis have no way of actually

deploying a CI-system into a company. As such we will focus on whether the information

disseminated would qualify to guide decisions.

9.2. Establishing a business case (domain)

As we have discussed in the two previous sections Novo Nordisk is a world leader in diabetes

care and as such the domain driving us forward is exactly that, the subject of diabetes. We

have also been through the importance of zoning in on a specific subject with regard to text

mining capabilities. As such even though Novo Nordisk has a stake in different areas we focus

solely on finding information about diabetes: information that may aid the manager in maintaining

Novo Nordisk’s image as a world leader in diabetes.

We have already established that the preferred social technology in the context of this thesis

is Twitter and if we needed to do further research on this we would quite likely end up with

Twitter regardless. We can relate this to the capabilities of Twitter in disseminating

information, at least in so far as it rings true that part of his job is to spread an image of Novo

Nordisk as a world leader in diabetes. We have already shown in section 7 that the likelihood

of such information reaching people it would not normally have reached is drastically

increased on Twitter (as per the compact network and retweets). As such we move directly to

explain how the stakeholders contained in the case material were selected.

9.3. Selection of stakeholders (balance diversity and expertise)

The communications from stakeholders that we will be analyzing in this case stem primarily

from a recommended list of Twitter accounts provided to us by Scott Dille. We left this choice

to him as diabetes as a subject is not exactly within the scope of our competencies, and he is

no doubt a business professional in this respect, which as we argued in section 8 makes it

reasonable to allow him to decide who to include. We did however analyze the participants to


ensure that we have both diversity and expertise in our stakeholder intelligence system. Here

we wish to draw out descriptions of a few accounts to portray our findings:

Pan American Health Organization

Twitter: @NCDs_PAHO (https://twitter.com/ncds_paho)

Twitter biography: “Learn what NCDs are, know the risk factors, and support the UN High-level

Meeting on NCDs and Wellness Week in NYC this September. PAHO tweets.”

Website: http://new.paho.org/hq/

Glu / T1D Exchange

Twitter: @MyGlu (https://twitter.com/myglu)

Twitter biography: “Glu is a new online community for people touched by type 1 diabetes. Glu is

part of the T1D Exchange - www.t1dexchange.org”

Website: http://t1dexchange.org

Peg Abernathy Group

Twitter: @PegAbernathy (https://twitter.com/pegabernathy)

Twitter biography: “Diabetes Advocate with 18 years experience and 22 years Type 1.”

Website: http://pegabernathygroup.com

AmandaMichelleManait

Twitter: @sweetliferunner (https://twitter.com/sweetliferunner)

Twitter biography: “I write my experiences as a Diabetic Runner to inspire people. If I can, you

can too! I run not to win over other runners, I run to win over Diabetes!”

Website: http://thesweetliferunner.blogspot.com

These are only 4 of the 58 accounts provided to us, and the total list presents a

potential of many diverse forms of perspectives all somehow coupled to diabetes. We note

here that in the list we have found a bias toward diabetes patients and non-professionals

otherwise affected by diabetes. In order to qualify the reason we stuck with this list we refer

to a correspondence we had with a co-author of the book Rethinking Expertise. We asked Dr.


Robert Evans of Cardiff School of Social Sciences for his assessment on the perceived

expertise of diabetes patients. The following is an excerpt of his answer:

“From our perspective, T1 patients, providing they have been patients for a long time, will

be experts in the process of living with diabetes. In this case, time since diagnosis provides a

way of placing them on the ladder and puts them in the category of contributory experts. Of

course, you might want newly diagnosed patients too because what they don't know or

struggle with might be revealing too (e.g of what current information doesn't say enough

about!).” (Appendix-3)

To clarify, contributory experts are, by Collins and Evans' definition, the last step on the
ladder of expertise within what they categorize as specialist tacit knowledge (Collins & Evans,
p. 14, 2007). With this in mind, we felt confident moving forward on the understanding that we
had a diverse group of stakeholders in which a sufficient amount of expertise on the subject of
diabetes was also present.

9.4. Initial analysis of information quality

In this section we present our methods of data collection, cleansing and analysis. We lay out
our approach in accordance with section 6, where we described how we would use text mining to
categorize our dataset in order to make sure that diabetes is the main topic of discussion, and
how we would perform a k-means clustering activity on the tweets we collected in an effort to
adhere to the concept of additive aggregation.

Data collection

We collected the data using R, a popular open-source programming language for statistics, for
which a package called twitteR can be downloaded. The express purpose of this package is to
access different functionalities of Twitter's API¹³, and we used it to download tweets from each
account. The number of tweets to download from each account was set to 150 in order to keep the
dataset at a size we had the computing power to handle. We took only one sample because not all
accounts held a total of 150 tweets, e.g. one account returned only 6 and another returned 86.
The list originally totaled 59 accounts, but one account had never sent a tweet and was
therefore removed from the dataset. Most of the remaining 58 accounts did return 150 tweets, and
we ended up with a total of 7763 tweets. In the following, whenever we refer to a document we
mean a single unique tweet, and when we refer to the document collection we mean the total of
7763 tweets.

13 https://dev.twitter.com/

Data cleansing

For our data cleansing activities as well as the analysis to come we used another open-source

program called RapidMiner, which is a program for statistics, data mining and text mining

along with different types of reporting features. In RapidMiner we performed the required

preprocessing steps related to text mining in order to reduce the dimensionality of our

dataset. RapidMiner has the functionality to handle all of these and as such we started by

performing tokenization by non-letter, which means that RapidMiner divides each document

into word-level features by interpreting each non-letter as either the start or the end of a

word. We then transformed each upper-case letter to lower-case, which we did because the

program would otherwise interpret Diabetes and diabetes as different words. All stop words in
the dataset, e.g. words like and, the, he, she and it, were then removed because, as we
discussed in section 6.2, these words hold little meaning, and our methods of analysis need the
simplest meaningful representation of a document. We then filtered out tokens consisting of
fewer than two characters, since special characters like ‘/’ were otherwise present in the
dataset. Furthermore, we stemmed each word using a built-in method of stemming, which for us
served to make it easier to detect when words like diabetes were

present in a document. The main reasoning behind the choice of stemming was that we found

many different representations and ways of expressing that a tweet was about the subject of

diabetes. E.g. diabetic, diabetes and other representations were stemmed to ‘diabet’ and as

such we obtained a more complete picture of the representation of diabetes in the document

collection. Lastly, after studying the dataset and the dimensionality of the document collection
we decided to make use of an additional feature in RapidMiner called ‘prune method’. We set it
to retain only words present in at least 50 documents. We note here that some of these actions
were necessary in order for our limited computing power to handle the dataset in analysis.
However, we were also satisfied that the document collection still held 168 different word-level
features which could be put into analysis.
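The preprocessing steps described above can be sketched in a few lines of Python. This is only an illustration and not the RapidMiner process we ran: the stop word list is a small hypothetical sample, `toy_stem` is a crude suffix stripper standing in for RapidMiner's built-in stemmer, and tokenization is assumed to split on anything that is not an ASCII letter.

```python
import re
from collections import Counter

# Hypothetical, heavily shortened stop word list (assumption for illustration).
STOP_WORDS = {"and", "the", "he", "she", "it", "a", "of", "to", "is", "for", "in"}

# Crude suffix list; a stand-in for a real stemmer, not the one RapidMiner uses.
SUFFIXES = ("ics", "es", "ic", "s")


def toy_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, min_length=2):
    """Tokenize by non-letter, lower-case, drop stop words and short tokens, stem."""
    tokens = [t.lower() for t in re.split(r"[^A-Za-z]+", text) if t]
    tokens = [t for t in tokens if t not in STOP_WORDS]
    tokens = [t for t in tokens if len(t) >= min_length]
    return [toy_stem(t) for t in tokens]


def prune_by_document_frequency(tokenized_docs, min_docs):
    """Keep only tokens that occur in at least `min_docs` documents."""
    document_frequency = Counter()
    for tokens in tokenized_docs:
        document_frequency.update(set(tokens))
    keep = {t for t, n in document_frequency.items() if n >= min_docs}
    return [[t for t in tokens if t in keep] for tokens in tokenized_docs]


example_tokens = preprocess("Diabetes and the diabetic: http://t.co/abc")
```

In this sketch a shortened link such as http://t.co/abc is split into ‘http’, ‘t’, ‘co’ and ‘abc’, and the single-letter token ‘t’ is then dropped by the length filter. With the full collection, `prune_by_document_frequency(docs, min_docs=50)` would correspond to the pruning setting described above.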

Data analysis

Before we move into the presentation of the outcome of our work with the dataset we wish to

refer back to section 4, where we described that the purpose of this thesis is solely to explore

the possibilities of bringing theoretical aspirations of stakeholder theory into business

practice. As such our approach to the coming analysis has mainly been to assess the

possibility of deriving information from the stakeholders in our system, which is also

congruent with how we, in section 8, imagined the purpose of an initial analysis would be.

As we discussed in section 6.3 the process of categorization might be used as a way to

evaluate whether the stakeholders we have included are actively involved in discussions

regarding the domain that has our interest, which in our case is diabetes. Categorization is, as

described, about finding one or more categories for each document to belong to. In the case of

another text mining analysis the documents included in the document collection may have a

much-increased dimensionality in comparison to the documents in our collection. Imagine if

we were analyzing books to find an efficient way for categorization in a library. Each book in

this example would constitute a single document and as such you can imagine that the

dimensionality is quite different from tweets, where the maximum number of character-level

features is set to 140. In other words, having text mining decide on a category for each
individual document in our collection might rightly be described as a rather superficial task.
Because of this we decided to categorize the document collection as a whole in order to make
sure that it was about diabetes and held information on this subject. After performing the
previously described preprocessing tasks we were left with a word list presenting the most
frequently occurring terms in the document collection as a whole:

Furthermore, our TF-IDF analysis of the document collection returned the following result:

The word list shows that the most frequent word-level feature was ‘http’, the second ‘co’ and
the third ‘diabet’. To provide some context, ‘http’ and ‘co’ occur on Twitter each time a link
is contained in a tweet,¹⁴ which, according to the study by Finin et al. (2007) discussed in
section 7.1, constitutes the sharing of information. The third word-level feature sorted by
number of occurrences was ‘diabet’, with a total of 1781 occurrences. If we compare this to the
next feature in line, ‘rt’ (an abbreviation of retweet), we find that the frequency of ‘diabet’
in our collection is three times that of its nearest competitor. Furthermore, ‘http’ and ‘co’
clearly occur much more frequently than any other word-level feature. Naturally, the links could
contain anything, and the fact that they occur more often than ‘diabet’ might also indicate that
some of them link to things other than information regarding diabetes. Furthermore, as shown in
the results from the TF-IDF analysis, ‘http’, ‘co’ and ‘diabet’ were also the word-level
features with the highest average weight in the document collection. From this we believe it to
be a reasonable conclusion that we have a collection of stakeholders whose main topic of
discussion is diabetes. For now, we hold back judgment on whether the information disseminated
can be said to be about diabetes.
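To make the two summaries used above concrete, the following Python sketch computes a word list (total occurrences per feature) and an average TF-IDF weight per feature. We do not know the exact weighting variant RapidMiner applies, so the sketch assumes one common textbook form: term frequency as count divided by document length, and inverse document frequency as the natural logarithm of N/df. The three example documents are hypothetical.

```python
import math
from collections import Counter


def term_counts(tokenized_docs):
    """Total number of occurrences of each feature across the collection."""
    counts = Counter()
    for tokens in tokenized_docs:
        counts.update(tokens)
    return counts


def tf_idf(tokenized_docs):
    """Per-document TF-IDF weights, assuming tf = count/len and idf = ln(N/df)."""
    n_docs = len(tokenized_docs)
    document_frequency = Counter()
    for tokens in tokenized_docs:
        document_frequency.update(set(tokens))
    weighted = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        weighted.append({
            term: (count / len(tokens)) * math.log(n_docs / document_frequency[term])
            for term, count in counts.items()
        })
    return weighted


def average_weights(weighted_docs):
    """Average weight of each feature over all documents (absent counts as 0)."""
    totals = {}
    for weights in weighted_docs:
        for term, weight in weights.items():
            totals[term] = totals.get(term, 0.0) + weight
    return {term: total / len(weighted_docs) for term, total in totals.items()}


# Hypothetical preprocessed tweets, for illustration only.
example_docs = [["diabet", "http", "co"], ["diabet", "insulin"], ["rt", "http", "co"]]
word_list = term_counts(example_docs).most_common()
average = average_weights(tf_idf(example_docs))
```

Note that under this particular variant a feature occurring in every document receives an IDF of zero, so the ranking it produces on a real collection may differ from the one RapidMiner reported.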

We move instead to present our results from the k-means clustering activity, also carried out
in RapidMiner. As mentioned in section 6.4, the clustering activity in our case revolves around
calculating a cosine similarity measure for each document on the basis of its TF-IDF weights and
then using that measure to assign each document to a cluster as well as possible. In RapidMiner
we can decide how many clusters the documents are to be clustered into, and after many failed
runs resulting in buffer overload we decided on 40 clusters.

14 The reason for the presence of ‘co’ instead of e.g. ‘com’ is not stemming. It is that the vast majority of links on Twitter are shortened to reduce the number of characters in them. In our dataset a shortened link looks like this: http://t.co/(random string of characters)
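The clustering idea described above can be illustrated with a small Python sketch of k-means that uses cosine similarity on sparse weight vectors. It is a simplified stand-in under our own assumptions, not RapidMiner's implementation: the centroids are naively initialized to the first k documents, and the example vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    totals = {}
    for vector in vectors:
        for term, weight in vector.items():
            totals[term] = totals.get(term, 0.0) + weight
    return {term: total / len(vectors) for term, total in totals.items()}


def kmeans_cosine(vectors, k, iterations=20):
    """Assign each vector to the most similar of k centroids until stable."""
    centroids = [dict(v) for v in vectors[:k]]  # naive initialization (assumption)
    assignment = [None] * len(vectors)
    for _ in range(iterations):
        new_assignment = [
            max(range(k), key=lambda c: cosine_similarity(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # no document changed cluster
        assignment = new_assignment
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return assignment


# Hypothetical TF-IDF vectors for four tweets.
example_vectors = [
    {"diabet": 1.0, "insulin": 1.0},
    {"research": 1.0, "http": 1.0},
    {"diabet": 1.0, "insulin": 0.5},
    {"research": 0.8, "http": 1.0},
]
labels = kmeans_cosine(example_vectors, k=2)
```

With 7763 documents and 168 features, each iteration compares every document with every one of the 40 centroids, which gives some sense of why the choice of k mattered for our limited computing power.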

RapidMiner can present the results of a clustering activity in numerous ways. We found the
results clearest in the graphical representation of a scatter plot (this will represent our
‘given space’), which in RapidMiner allows us to triangulate word-level features to find
documents containing them. In other words, we could enter e.g. ‘diabet’, ‘thanks’ and ‘happy’,
and it would show us documents containing those words and the relative distance between them.
Since we were trying to confirm the presence of documents referring to information about
diabetes, we found it most effective to keep two word-level features stable, namely ‘diabet’ and
‘http’. The first scatter plot we include here shows a collection of documents containing the
features ‘diabet’, ‘http’ and ‘co’:

Each dot on this graph represents a single document; the further it is placed along the y-axis

the more weight the feature ‘diabet’ has in the document, the further along the x-axis the

more weight the feature ‘http’ has. The color coding which goes from blue to red then

represents the weight of the feature ‘co’, where the navy-blue color means no weight and the

red high weight. This quite clearly shows that we have a very large number of documents
referring to diabetes as well as containing a link. When we started looking for more interesting
triangulations using this functionality, we moved ‘http’ to the color coding and kept ‘diabet’
along the y-axis. Had we not done this, every result would have looked like the first plot, with
only some documents colored to represent the newly included word-level feature, which made it
difficult to get an overview of the documents present in the given collection. In total we found
six additional interesting representations in our dataset, two of which we include and discuss
in this section, while the rest can be found in appendix-4. The first was ‘diabet’, ‘http’ and
‘insulin’:

On this graph each dot that is not navy-blue contains the features ‘diabet’, ‘http’ and ‘insulin’.¹⁵

We were able to confirm this through manual inspection of the documents on the graph and

we will here bring a few examples of the contents in these documents.

The first is the second rightmost cyan-colored dot, which is a document from the user
‘@sstrumello’, who tweets:

"Thermalin Diabetes gets $4.5M NIH grant for next-generation insulin
analogue http://t.co/yymFcECb via @MedCityNews" (sstrumelloTweets-10.txt)¹⁶

The link in this tweet took us to a news article explaining, as described by the user, that a
$4.5M grant had been given to the company Thermalin Diabetes in order for it to develop a new
insulin-related product. The next and last example we include from this scatter plot is from the
user ‘@JoyofDiabetes’, whose tweet contained the following:

15 Unfortunately, upon importing images of the scatter plots into Word, the quality of the images fell drastically regardless of the quality chosen when exporting them from RapidMiner. We were not able to fix this issue, but we hope they provide enough detail to show the color coding as well as the distribution.

16 The text files we refer to here can be found in the folder ‘Finds’ in the zipped folder ‘Data.zip’ accompanying the thesis.

"RT @HealthyNews_WR High blood sugar and insulin levels linked to

#heart disease: http://dld.bz/bCD #diabetes #diabetic"

(JoyofDiabetesTweets-128.txt)

This again took us to a news article this time revealing research that had shown a link

between high blood sugar and risk of developing heart disease. In total we found 22 different

tweets from 9 different users, which contained the relevant word-level features. The most

recurring topic was some form of information pertaining to insulin pumps but the contents of

the links varied too much to constitute a significant pattern of discussion in relation to insulin

pumps. The 22 tweets in their full length can be found in the attached data in the folder

‘diabet-http-insulin’.

The second scatter plot we wish to include here is one with the features ‘diabet’, ‘http’ and

‘research’:

Again we bring a couple of examples of the contents in the tweets distributed on this scatter

plot. One user who calls himself ‘@DiabetesRx’ on Twitter had this to say:

"Fasting lowers risk of heart disease, diabetes | The Salt Lake Tribune

http://t.co/JrnpQjY via @AddThis-However, Research was SHORT TERM!"

(DiabetesRxTweets-96.txt)

This link again led us to a news site, but this time the news cited research pertaining to the
perceived benefits of fasting in relation to diabetes. Another user, ‘@DiabetesPower1’, was also
disseminating information regarding research; the tweet indicates this and the link confirms it:

"CU Researchers Find Cure For Type 1 Diabetes In Mice « CBS Denver
http://t.co/JfktbGcC" (DiabetesPower1Tweets.txt)

We found 13 tweets from 10 different users, which contained relevant features in relation to

this second scatter plot. Again the contents were too diverse to find a pattern displaying a

specific interest in one area of research but again the presence of word-level features in the

tweets at least displayed a shared interest in diabetes research.
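The selection behind the two scatter plots above amounts to keeping the documents in which all three chosen features carry weight and then counting the tweets and the distinct users behind them. A minimal Python sketch of that filter follows. The weights are hypothetical, and the way the user name is read from the document name is an assumption modeled on the file names cited above (e.g. ‘sstrumelloTweets-10.txt’).

```python
def documents_with_features(weighted_docs, features):
    """Return the ids of documents in which every given feature has weight > 0."""
    return sorted(
        doc_id
        for doc_id, weights in weighted_docs.items()
        if all(weights.get(feature, 0.0) > 0.0 for feature in features)
    )


def distinct_users(doc_ids):
    """Derive user names from ids of the assumed form '<user>Tweets-<n>.txt'."""
    return {doc_id.split("Tweets")[0] for doc_id in doc_ids}


# Hypothetical weights; only the document names are taken from the text.
example_weights = {
    "sstrumelloTweets-10.txt": {"diabet": 0.4, "http": 0.2, "insulin": 0.5},
    "JoyofDiabetesTweets-128.txt": {"diabet": 0.3, "http": 0.2, "insulin": 0.4},
    "DiabetesRxTweets-96.txt": {"diabet": 0.3, "http": 0.2, "research": 0.6},
}

insulin_docs = documents_with_features(example_weights, ["diabet", "http", "insulin"])
insulin_users = distinct_users(insulin_docs)
```

Applied to the full collection, the lengths of `insulin_docs` and `insulin_users` would correspond to the kind of figures we report above, such as 22 tweets from 9 users.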

We found many different styles of presentation and different types of content in the links

disseminated by the stakeholders in our pretend CI-system. Generally, it was difficult to zone

in on patterns in the information disseminated other than the inherent feature-patterns. So

we conclude the presentation of our results here and refer to the appendix-4 for the

remaining four scatter plots, the document collections for these can also be found in the

attached dataset in the folder ‘Finds’. The results yielded by the clustering activity showed

that without a doubt the stakeholders in our system are talking and spreading information

about diabetes. We referred to Finin et al. (2007) and focused on links to find information but

the remainder of the documents in the scatter plots was also communicating with reference to

the stable feature ‘diabet’ and the other, non-stable feature. As such, even though they did not
refer to an external source of information, they were tweeting with some reference to diabetes
and to a subject within the scope of our interest. Generally, the document collection contained
a healthy mix of the user intentions: daily chatter, conversations, sharing information and

reporting news. We did however find a tendency, also indicated in the previously portrayed

word list, toward diabetes and some link to an external source of information. However, we

were not able to conclusively state whether the information disseminated in the system is of

the needed relevance or quality. Ideally we would have liked to see many accounts tweeting

about something more specific pertaining to diabetes. However, based on the clustering

activity we found at most indications of this. In the following section we discuss our results
in relation to the proposal as a whole. In doing so we also refer to the correspondence we had
with the manager from Novo Nordisk, whose views on the needed information quality form part of
this evaluation.

10. Evaluation of results and model

In section 6 we discussed the basic foundation of text mining and two methods, categorization
and clustering respectively, which we conceived as a way of delivering valuable information
from stakeholders on social media. We applied these methods in our data analysis

and found that categorization may be able to help us assess whether our preconceived

notions of relevance were consistent with the content of the data we worked on. In our case

relevance was tied to the domain of diabetes and we found the talk and the information

shared to be about diabetes. Clustering delivered to us indications that the stakeholders in our

system were at some point involved, by presence of word-level features in their

communications, in the same issues. However, the results were not consistent enough for us to
claim with any certainty that the stakeholders were speaking to the exact same issues.

Even upon focusing our analysis on word-level features e.g. ‘diabet’, ‘http’ and ‘insulin’ there

may yet be many different issues to decipher in communications about these. In this section

we wish to discuss our method of analysis in relation to the prospect of bringing such

information into a company. Upon having done this we zoom out and highlight challenges

related to our proposal as a whole.

Categorization and Clustering

Reliance on the results produced by a process of categorization we feel demands some insight

into computational elements of the method. Broadly speaking, categorization holds many

different methods some of which are more complex than others. We chose TF-IDF, an

automated approach, for ease of access and because as we explained in the previous section

each document in our collection could contain a maximum of 140 character-level features. We

saw that ‘http’, ‘diabet’ and ‘co’ held the highest weight in our dataset, and while, as stated,
this confirmed our belief that diabetes was the topic of discussion, the weight is calculated
purely from the presence of ‘diabet’ in the document collection. Consider, for example, two
tweets made up of word-level features: one with six, “diabetes patients demand better insulin
pumps”, and another with four, “diabetes tower diabetes squared”. In this example the second
tweet will give ‘diabetes’ the higher weight even though its content is utterly meaningless.
The clustering task also needs TF-IDF weights in order to apply the cosine similarity measure
and start clustering the documents.

This speaks to the difficulty in bringing language into computation and analysis but this does

not deter us from the belief that text mining will be a valuable tool in this. As is evident in the

results in section 9.4 text mining did actually lead us to meaningful communications on social

media. However, it underlines the importance of carefully assessing what information is to be

derived, what the quality of that information has to be and how we might couple methods to

deliver the best possible solution in the context of a text mining system. We spoke to the

manager at Novo Nordisk about this and asked him questions in relation to requirements for
both the technological aspects and the quality of the information derived. The

correspondence can be found in its entirety in appendix-2.
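The two example tweets discussed above can be worked through in a few lines of Python. The sketch only computes the term frequency part of the weight, assuming it is defined as the number of occurrences divided by the number of word-level features in the tweet. Since both tweets belong to the same collection, the inverse document frequency of ‘diabetes’ would be identical for both, so the term frequency alone decides which tweet gets the higher weight.

```python
def term_frequency(term, tokens):
    """Share of a document's tokens that are equal to `term`."""
    return tokens.count(term) / len(tokens)


meaningful = "diabetes patients demand better insulin pumps".split()
meaningless = "diabetes tower diabetes squared".split()

tf_meaningful = term_frequency("diabetes", meaningful)    # 1 of 6 features
tf_meaningless = term_frequency("diabetes", meaningless)  # 2 of 4 features
```

The meaningless tweet ends up with a term frequency of 2/4 against 1/6 for the meaningful one, which is exactly the weakness of a purely frequency-based weight that the example is meant to show.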

Bringing information into the business

The answers given in regard to information quality hinged heavily on the fact that Twitter

was the medium for communication and information dissemination. In other words quality of

information was set heavily in the context of Twitter functionality. Interestingly, his answer to

what might constitute information of a quality that would allow him to act on it (communicate

about it) deviated from our perceptions of importance, which might again bespeak the

importance of including the business professional in the conception of such a CI-system. The

following is an excerpt from his answers:

“I would say actionable intelligence is information that provides guidance on the likelihood

that a message will be perceived as valuable based on some evidence (i.e. it is a hot topic

among key influencers or the community), the probability that a message will be

ReTweeted if I post it, and the probability that the community I am targeting will spread

my message. What I would like is to make decisions to Tweet or ReTweet with some

evidence and insight into how the message will be received and promoted.” (Appendix-2)

When asked about his view on how many such stakeholders would have to have shown

interest in a subject for it to be classified as a ‘hot topic’ the answer was 1000+ stakeholders.

(Appendix-2) As such we quite clearly see that the information we were able to derive from the
stakeholders in our system cannot be of sufficient quality, owing to the sheer lack of numbers.

Additionally, calculating the probability that a message will be retweeted and spread through

the community could possibly be conceived as some calculation on historical data collected in

the given CI-system on Twitter. We see no reason that this should not be a possibility in

practice but such a claim is outside the scope of this thesis. This does however bespeak

another issue with our method, namely, that we lack a clear timeframe for the data analyzed.

In actuality we have no way of assessing the timely relevance of the topics discussed in our

selection of stakeholders. Of course there is no way to ensure that stakeholders selected will

keep on discussing diabetes, however, some measure of relative activity for each user would

perhaps be a fruitful addition. It would seem that all of the issues presented here relate to
the dataset used in the thesis rather than to the methods applied to the data. Therefore, we

allow ourselves to conjecture that if we establish a CI-system with a much larger number of

unique stakeholders communicating about the same subject, we might be better braced to

find information by way of additive aggregation that may aid us in driving decisions. Lastly,

we wish to point out that we are aware that the information when being brought into the

company would need to be presented in an efficient way. In a position like that of the
communications manager it seems unlikely that he would have time to sit and sift through the
links present in the communications, especially as the number of communications included in
the system increases. As such we end this discussion by noting that for the model to be more

than conceptually applicable most probably some method for summarizing the link-contents

and recognizing links referring to the same information is needed.
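The manager's ‘hot topic’ criterion discussed above could, as a first approximation, be checked by counting how many distinct stakeholders have used a given word-level feature. The Python sketch below illustrates this. It is our own simplification and not something the manager specified: a feature counts as a hot topic when at least `min_stakeholders` distinct users have used it, and the example data are hypothetical.

```python
from collections import defaultdict


def stakeholders_per_feature(documents):
    """Count distinct users per feature; `documents` holds (user, tokens) pairs."""
    users = defaultdict(set)
    for user, tokens in documents:
        for token in set(tokens):
            users[token].add(user)
    return {feature: len(names) for feature, names in users.items()}


def hot_topics(documents, min_stakeholders=1000):
    """Features used by at least `min_stakeholders` distinct users."""
    counts = stakeholders_per_feature(documents)
    return sorted(f for f, n in counts.items() if n >= min_stakeholders)


# Hypothetical preprocessed tweets with their authors.
example_documents = [
    ("sstrumello", ["diabet", "insulin", "http"]),
    ("JoyofDiabetes", ["diabet", "insulin", "heart"]),
    ("JoyofDiabetes", ["diabet", "heart"]),
    ("DiabetesRx", ["diabet", "research"]),
]

counts = stakeholders_per_feature(example_documents)
topics_at_two = hot_topics(example_documents, min_stakeholders=2)
```

With only 58 accounts in our system, no feature could reach the default threshold of 1000 stakeholders, which is the shortfall noted above.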

The issue of trust and the difficulty of assessing legitimacy

This discussion relates to the demand of how we can ensure the balance between diversity

and expertise. In actuality what we have based our judgments on is an estimate provided by

the manager in the case. However, this grants little guidance for companies outside the scope

of this thesis. In such cases judgments would most likely have to be based on what

information the account reveals about the owner and an analysis of the information

disseminated by the account. The following portrays some of the difficulties in relation to

assessing the relative legitimacy of an account on social media, and thereby also the

information disseminated by that account. Leimeister (2010) speaks to this issue in general
terms when he states that the actors in the system may act out and in some cases may even be
out to cause sabotage, e.g. by spreading false information or acting rudely (Leimeister, p. 247,
2010).

Manovich (2011) claims that we must be careful reading communications over social media

as authentic:

“Peoples’ posts, tweets, uploaded photographs, comments, and other types of online

participation are not transparent windows into their selves; instead, they are often

carefully curated and systematically managed.” (Manovich, p. 6, 2011)

Dellacoras (2003) reiterates this perspective in his study of online feedback mechanisms,

where he takes issue with what he terms as the volatility of online identities and the ability to

precisely design the distributed message. (Dellacoras, p. 1422, 2003) These issues of course
also affect the legitimacy of our proposal and put focus on two relevant questions: how we can
trust that stakeholders are who they claim to be, and how we might assess the trustworthiness
of the information disseminated. This discussion dates as far back as

Kozinetz (1998) in one of the first studies of the interaction between people online. (Kozinetz,

1998, n.p.) He, however, also suggests how we might alleviate the problem when he states:

“Over time, with patient observation of any virtual community, with a few key informants

with whom one has built a strong and trusting relationship, and with a deep understanding

of one’s own inner identification as a culture member, a netnographer is likely to be able to

separate the wheat from the chaff, and construct a representation faithful to the

interpretations of bona fide culture members.” (Kozinetz, 1998, n.p.)

If we conceive of the list provided to us by the communications manager at Novo Nordisk as a

list containing stakeholders he perceives to be worthy of trust have we then solved this issue?

Most likely not but with our proposal we are ourselves involved in assessing the relevance

and trustworthiness of an account as well as the information disseminated from that account.

As such regardless of who we include and what information we dare to trust, it will be our

choice and as such we carry the fault if things go awry. In addition, we dare to conjecture
that, at least in some cases, the business professional included will be able, from his
experience, to grant insights into this and as such perhaps alleviate some of the stress
surrounding the issue. We believe it to be true that these perspectives are valid

representations of difficulties on the net but if nothing else, the emergence of social media has

at least made it easier to assess who the person behind the account is. Lastly, in line with this

we find it quite reasonable to not base decisions on e.g. the words in a tweet. Therefore we

note again that for us to be more trusting of the information collected we would most likely

need the ability to effectively assess the content of the links distributed.

Influence and Community

The project of this thesis is as stated previously to provide insight into the possibilities of

taking theoretical aspirations of stakeholder theory and molding them into practice. We

present here an overall evaluation of our model and assess whether we have succeeded in

this. Upon going through contemporary efforts in stakeholder theory and correlating this with

theoretical capabilities of collective intelligence and business intelligence we were able to

conceptualize a model for stakeholder engagement on social media. However, one of these
aspirations was to move from a view of engagement where the firm exclusively decides on
each important aspect of the engagement to one where an, in theory, unbounded number
of participants decide what is important. We realize that upon reaching a model we have

landed on a perspective, where in one aspect we are back to a traditional view of stakeholder

engagement. However, we believe the model to be able to provide a more meaningful

inclusion of stakeholders than what a mere presence on social media provides. We do this by

allowing stakeholders to be included in a system and let their voices be heard. The most

significant difference lies in the suggestion that the company passively observes the system,

which may aid them in escaping bias-distortions in the decision-making processes

surrounding stakeholder engagement. However, how such a system is used and if the core

principles of including many and letting them decide are adhered to we of course cannot say.

The theory behind our model clearly states the benefits of an unintrusive approach but in

implementation and use it will rely on the company’s actions in relation to this. This issue

seems impossible to escape.

Furthermore, our model deliberately stays passive and stakeholder engagement can hardly be

defined as a passive activity. We find two questions to be relevant in relation to this. The
first is how stakeholders are to be made aware that they are included in such a system, so that
they can feel a part of it and feel that they are being heard. The second is how passively
listening in and subsequently communicating about what is heard affects both the outcome of the
interaction and the relationship between the company and the stakeholder.

The first is undoubtedly difficult to answer and will again most likely be handled differently

from one company to another. In section 3 we explained in short Bonabeau’s view that our

human nature tends to lead us to favor information that fits the state of our current beliefs.

This is reiterated by research carried out by Dan Ariely (2010) in which he defined the

presence of the so-called Not-Invented-Here bias that also speaks to the fact that humans have

a clear tendency to favor their own ideas over ideas conceived by others. (Ariely, chap. 4,

2010) If this is so there may be an indication that when we have collected the opinions and

solutions from stakeholders in our system they may be likely to positively perceive the

subsequent communication from a company. However, this will obviously rely on the

individual stakeholder’s ability to recognize her own contribution at the time of subsequent

communication or action we as a company take in relation to that contribution. Another

solution may be to simply ask stakeholders whether they would be interested in being

included in such a system. However, we will refrain from claiming to have an answer to the

best solution in this regard.

The second question is posed because we recognize that in essence our model has the

purpose of bridging the gap between company and stakeholders by allowing them to enter

into decision-making processes. We do so by analyzing communications and from that

deriving intentions of our stakeholders based on the content of these. However, we also
realize that the position we have put ourselves in could be interpreted as very much a position
of power. We have not discussed this issue so far, and as such we must assume that stakeholders
have no way of explaining the intentionality behind their communications. To provide further
explanation of

this issue we refer to Habermas (1985) and the following citation:

“…Owing to this linguistic structure, it (communicatively achieved agreement) cannot be

merely induced through outside influence; it has to be accepted or presupposed as valid by

the participants…A communicatively achieved agreement has a rational basis; it cannot be

imposed by either party...Agreement can indeed be objectively obtained by force; but what

comes to pass manifestly through outside influence or the use of violence cannot count

subjectively as agreement. Agreements rests on common convictions.” (Habermas, p. 287,

1985)

What he states here is, in short, that in order for us to truly be able to claim that we have an

understanding of what our stakeholders are saying, we cannot take a position of power from

which we sit and decide upon the intended meaning in a communication. If this is so then we

obviously run into trouble and whether this issue can be alleviated in the context of

stakeholder engagement seems difficult to answer. In short, it would seem to take us back to

face-to-face engagements as the only means to capture the essence of an issue, which is not a

possibility on social media. Additionally, in a business context many pieces must

undoubtedly come together in order for a complete puzzle to emerge. It seems that in

bringing our model into practice we cannot capture all of them, and as such it will be up

to the company in question to decide whether this issue makes the model unusable.

11. Conclusion

We have come to the end of our attempts to find a model for stakeholder engagement on

social media. We found that social media most likely does not accommodate traditional forms

of engagement and in that difficulty found inspiration to apply other perspectives on

stakeholder engagement. These were partly found in the context of the logic of community,

but the primary inspiration emerged from the theory of collective intelligence. This field

provided support for the perspective that stakeholder engagement can be taken to social

media, and that valuable insights can be gained from doing so. We moved forward in an attempt to

consolidate these theoretical perspectives with practical applicability, and in doing so we

found both possibilities and challenges. Much can be said in favor of the use of text mining

in this regard, and we strongly believe that as this field evolves, so will the practicality and

value of our proposed model. Additionally, we also believe that as more research provides

insight into the technological foundation of social technologies, as well as their users, the model

may gain in strength. We acknowledge the many inherent challenges in deriving any

sort of information from social media, but also believe that our model may be applied provided

that a company is aware of these. In the end, the challenges do not make it impossible

to take stakeholder engagement to social media. In our view, they simply show that there may

be no replacement for the value derived from face-to-face engagements, but this does not

mean that there is no value to gain from taking stakeholder engagement to social

media. It seems to depend highly upon the purpose. If the purpose is to reach more

stakeholders and allow these to form some sort of relation to the company, then engagement

on social media suddenly seems an invaluable tool. Furthermore, if the purpose is the

spreading of a company’s image then the capabilities of a medium like Twitter seem

indisputable. In contrast, if the purpose is to get to the bottom of an issue arising from the fact that a

company’s actions affect an external environment, then real-world meetings, conferences and

the like will most likely still be the best approach, since in such a case it may well be essential

to have a complete understanding of the views held by stakeholders. However, we are

satisfied that we have shown the possibility of taking stakeholder engagement to social media.

12. Bibliography

12.1. Articles (Order of Appearance)

1: Porter, M.E. (1979) How Competitive Forces Shape Strategy, Harvard Business Review

2: Bennett, W.L. (2003), New Media Power: The Internet and Global Activism, Oxford: Rowman & Littlefield

3: Castelló, I., Etter, M. & Morsing, M. (Forthcoming), Why Stakeholder Engagement will not be Tweeted: Logic and Conditions of Authority Corset, Paper Presented at the Academy of Management 2012, Boston, USA.

4: Scherer, A.G., Palazzo, G. (2011) The New Political Role of Business in a Globalized World: A Review of a New Perspective on CSR and its Implications for the Firm, Governance, and Democracy, Journal of Management Studies, 48(4): 889-931.

5: Bughin, J., Byers, A. H., Chui, M. (2011), How social technologies are extending the organization, McKinsey Global Institute, http://www.mckinseyquarterly.com/How_social_technologies_are_extending_the_organization_2888 (Seen on 24/7-2012)

6: Hutton, G., Fosdick, M. (2011), The Globalization of Social Media - Consumer Relationships with Brands Evolve in the Digital Space, Journal of Advertising Research

7: Nonaka, I. (1994), A Dynamic Theory of Organizational Knowledge Creation, Organization Science, Vol. 5, No. 1, February 1994

8: AccountAbility (2011), AA1000 Stakeholder Engagement Standard 2011, http://www.accountability.org/about-us/publications/index.html (Seen on 24/7-2012)

9: Dellarocas, C. (2003), The Digitization of Word of Mouth: Promise and Challenges of Online Feedback Mechanisms, Management Science (INFORMS), Vol. 49, No. 10, pp. 1407–1424

10: Hypatia Research LLC (2011), Benchmarking Social Community Investments & ROI: Best Practices Vendor Selection Guide, ©Hypatia Research LLC

11: Leimeister, J. M. (2010), Collective Intelligence, Gabler Verlag

12: Malone, T. W., Laubacher, R., Dellarocas, C. (2009), Harnessing Crowds: Mapping the Genome of Collective Intelligence, MIT Center for Collective Intelligence, Working Paper No. 2009-001

13: Bonabeau, E. (2009), Decisions 2.0: The power of collective intelligence, MIT Sloan Management Review, 50(2), 45-52

14: Barnes, N. G., Andonian, J. (2011), The 2011 Fortune 500 and Social Media Adoption: Have America's Largest Companies Reached a Social Media Plateau?, University of Massachusetts Dartmouth, http://www.umassd.edu/cmr/studiesandresearch/2011fortune500/ (Seen on 24/7-2012)

15: Luhn, H. P. (1958), A Business Intelligence System, IBM Journal of Research and Development October 1958 (p. 314-319)

16: Lovejoy, K., Waters, R. D., Saxton, G. D. (2012), Engaging Stakeholders through Twitter: How nonprofit organizations are getting more out of 140 characters or less, Public Relations Review 38 (2012) 313–318

17: Kwak, H., Lee, C., Park, H., Moon, S. (2010), What is Twitter, a Social Network or a News Media?, Proceedings of the 19th International World Wide Web (WWW) Conference, April 26-30, 2010, Raleigh NC (USA)

18: Finin, T., Java, A., Song, X., Tseng, B. (2007), Why We Twitter: Understanding Microblogging Usage and Communities, Joint 9th WEBKDD and 1st SNA-KDD Workshop ’07, August 12, 2007

19: Imhoff, C., White, C. (2011), Self-Service Business Intelligence - Empowering Users to Generate Insights, TDWI Best Practices Report - Third Quarter 2011

20: Cohen, K. B., Hunter, L. (2008), Getting Started in Text Mining, PLoS Comput Biol 4(1): e20. doi:10.1371/journal.pcbi.0040020

21: Food and Drug Administration (2011), Draft Guidance for Industry on Responding to Unsolicited Requests for Off-Label Information About Prescription Drugs and Medical Devices, Food and Drug Administration, https://www.federalregister.gov/articles/2011/12/30/2011-33550/draft-guidance-for-industry-on-responding-to-unsolicited-requests-for-off-label-information-about (Seen on 24/7-2012)

22: Manovich, L. (2011), Trending: The Promises and Challenges of Big Social Data, http://manovich.net/articles/ (Seen on 24/7-2012)

23: Kozinets, R. V. (1998), On Netnography: Initial Reflections on Consumer Research Investigations of Cyberculture, Advances in Consumer Research, Volume 25, http://www.acrwebsite.org/volumes/display.asp?id=8180 (Seen on 24/7-2012)

12.2. Books (Order of Appearance)

1: Freeman, R. E. (1984), Strategic Management: A stakeholder approach, Boston: Pitman, ISBN: 0-273-01913-9

2: Li, C., Bernoff, J. (2008), Groundswell: Winning in a World Transformed by Social Technologies, Harvard Business School Press, ISBN: 1422125009

3: Collins, H., Evans, R. (2007), Rethinking Expertise, The University of Chicago Press, ISBN: 0-226-11360-4

4: Hacking, I. (1999), The Social Construction of What?, Harvard University Press, ISBN 0-674-81200-X

5: Berry, M. J. A., Linoff, G. S. (2004), Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, John Wiley & Sons, New York, ISBN: 0470650931

6: Feldman, R., Sanger, J. (2006), The Text Mining Handbook - Advanced Approaches in Analyzing Unstructured Data, Cambridge University Press, ISBN: 0-521-83657-3

7: Ariely, D. (2010), The Upside of Irrationality: The Unexpected Benefits of Defying Logic at Work and at Home, Harper, ISBN: 0061995037

8: Habermas, J. (1985), The Theory of Communicative Action, Beacon Press, ISBN: 0807015075

12.3. Links (Order of Appearance)

1: AccountAbility Website: www.AccountAbility.org (Seen on 24/7-2012)

2: Google Example: https://plus.google.com/u/0/100585555255542998765/posts/h7LNZ8zUAdF (Seen on 24/7-2012)

3: Wefollow Website 1: www.wefollow.com (Seen on 24/7-2012)

4: Wefollow Website 2: http://wefollow.com/twitter/ngo (Seen on 24/7-2012)

5: Wefollow Website 3: http://wefollow.com/twitter/advocacy (Seen on 24/7-2012)

6: Languagemonitor Website: http://www.languagemonitor.com/global-english/number-of-words-in-the-english-language-1008879/ (Seen on 24/7-2012)

7: Twopcharts Website: http://twopcharts.com/twitter500million (Seen on 24/7-2012)

8: TDWI Website: www.tdwi.org (Seen on 24/7-2012)

9: Novo Nordisk 1: http://www.novonordisk.com/about_us/about_novo_nordisk/introduction.asp (Seen on 24/7-2012)

10: Novo Nordisk 2: http://www.novonordisk.com/about_us/history/milestones_in_nn_history.asp (Seen on 24/7-2012)

11: Novo Nordisk 3: http://www.novonordisk.com/sustainability/sustainability-approach/stakeholder-engagement.asp (Seen on 24/7-2012)

12: Novo Nordisk 4: http://annualreport2011.novonordisk.com/stakeholders-and-reporting/stakeholders/memberships.aspx (Seen on 24/7-2012)

13: Novo Nordisk 5: http://annualreport2011.novonordisk.com/stakeholders-and-reporting/stakeholders/partnerships.aspx (Seen on 24/7-2012)

14: Pan American Health Organization: https://twitter.com/ncds_paho, http://new.paho.org/hq/ (Seen on 24/7-2012)

15: Glu / T1D Exchange: https://twitter.com/myglu, http://t1dexchange.org (Seen on 24/7-2012)

16: Peg Abernathy Group: https://twitter.com/pegabernathy, http://pegabernathygroup.com (Seen on 24/7-2012)

17: AmandaMichelleManait: https://twitter.com/sweetliferunner, http://thesweetliferunner.blogspot.com (Seen on 24/7-2012)

18: Twitter Developer Site: https://dev.twitter.com/ (Seen on 24/7-2012)

19: Tweet from ‘@sstrumello’: http://t.co/yymFcECb (Seen on 24/7-2012)

19: Tweet from ‘@JoyofDiabetes’: http://dld.bz/bCD (Seen on 25/7-2012)

20: Tweet from ‘@DiabetesRx’: http://t.co/JrnpQjY (Seen on 25/7-2012)

21: Tweet from ‘@DiabetesPower1’: http://t.co/JfktbGcC (Seen on 25/7-2012)

12.4. Programs Used (Order of Appearance)

1: The R Project for Statistical Computing, http://www.r-project.org/

2: RapidMiner, Rapid-I GmbH, http://rapid-i.com/content/view/181/196/
