The Danger of Big Data - Social Media as Computational Social Science


    First Monday, Volume 17, Number 7 - 2 July 2012

Social networking Web sites are amassing vast quantities of data, and computational social science is providing tools to process this data. The combination of these two factors has significant implications for individuals and society. With announcements of growing data aggregation by both Google and Facebook, the need for consideration of these issues is becoming urgent. Just as Web 2.0 platforms put publishing in the hands of the masses, without adequate safeguards computational social science may make surveillance, profiling, and targeting overly accessible.

The academic study of computational social science explains the field as an interdisciplinary investigation of the social dynamics of society with the aid of advanced computational systems. Such investigation can operate at the macro level of global attitudes and trends, down to the personal level of an individual's psychology. This paper uses the lens of computational social science to consider the uses and dangers that may result from the data aggregation social media companies are pursuing. We also consider the role ethics and regulation may play in protecting the public.

    Contents

Introduction
Computational social science and societal risk
The use of computational social science in the social media sphere
The aggregation of data brings additional risk
A lesson from the social sciences
The business customers of social media
The government as protector of consumer rights
The government as a user of data
Platform providers and the ethical development of social media
Platform users and the ethical use of social media
Conclusion

    Introduction

Computational social science is the interdisciplinary investigation of the social dynamics of society, conducted from an information perspective, through the medium of advanced computational systems (Cioffi-Revilla, 2010). Computational social science can span all five traditional social science disciplines: social psychology, anthropology, economics, political science and sociology. It can operate at various levels of analysis: individual cognition, decision making, and behaviour; group dynamics, organization and management; and societal behaviour in local communities, nation states and the world system.

Computational social science is, like microbiology, radio astronomy, or nanoscience, an instrument-based discipline (Cioffi-Revilla, 2010). An instrument-based discipline enables the observation and empirical study of phenomena through a key instrument. Whether the instrument is a microscope, radar, electron microscope or some other tool, it serves as a lens making an otherwise invisible subject matter visible to the observer. In computational social science, the instrument takes the shape of computer systems and datasets; their availability and sophistication drive the development of theory, understanding and practical advances.

As computing power becomes faster, software more sophisticated, and society more willing to put data online, computational social science is expanding in two directions simultaneously. The horizontal expansion sees greater access to relatively shallow aggregations of data, while the vertical expansion enables larger operators to add depth to their models by aggregating related information from multiple sources. A prime example of this is the recent announcement by Google of changes to its privacy policy and terms of service; Google itself says this means: "if you're signed in, we may combine information you've provided from one service with information from other services. In short, we'll treat you as a single user across all our products" (Whitten, 2012).

As the reach of computational social science grows, questions of both methodology and ethics, drawn from the underlying fields of computational and social sciences, need to be considered. These considerations apply not only to the research context, but also, and more importantly, to the worlds of government and commerce, where philosophical concerns are less likely to rebuff immediate practical benefits. Most significantly, these concerns need to be considered in the context of social media platforms, which have become computational social science tools that sit within easy reach of businesses, governments, private citizens, and the platform operators themselves.

Without a resolution to outstanding ethical issues on data storage, access and use by actors in a variety of different roles, advancements in computational social science may put the public at increased risk. To date, research in this area has been limited. We aim to provoke thought and discussion on how the use of social media as a computational social science tool should be constrained, both legally and ethically, to protect society. Prior to the public listing of Facebook on the Nasdaq, CEO Mark Zuckerberg announced his core values to potential investors; Zuckerberg's promotion of risk taking, of the need to "move fast and break things", highlights the need for external constraints so society is not left bearing the cost of mistakes by social media innovators (Oboler, 2012).

This paper begins with a consideration of the nature and risks of computational social science, followed by a focus on social media platforms as social science tools. We then discuss the aggregation of data and the expansion of computational social science along both horizontal and vertical axes. We consider the problems aggregation has raised in past social science research, as well as the potential problems raised by the use of social media as computational social science by business customers, government, platform providers and platform users; this discussion includes consideration of consumer protection, ethical codes, and civil liberty impacts. We end by highlighting the richness of social media data for computational social science research and the need to ensure this data is used ethically and the public is protected from abuse. The danger today is that computational social science is being used opaquely and near ubiquitously, without recognition or regard for the past debate on ethical social science experimentation.

    Computational social science and societal risk

Computational social science involves the collection, retention, use and disclosure of information to answer enquiries from the social sciences. As an instrument-based discipline, the scope of investigation is largely controlled by the parameters of the computer system involved. These parameters can include: the type of information people will make available, data retention policies, the ability to collect and link additional information to subjects in the study, and the processing ability of the system. The capacity to collect and analyze data sets on a vast scale provides leverage to reveal patterns of individual and group behaviour (Lazer, et al., 2009).

The revelation of these patterns can be a concern when they are made available to business and government. It is, however, precisely business and government who today control the vast quantities of data used for computational social science analysis.

Some data should not be readily available: this is why we have laws restricting the use of wiretaps and protecting medical records. The potential damage from inappropriate disclosure of information is sometimes obvious. However, the potential damage of multiple individually benign pieces of information being combined to infer, or a large dataset being analysed to reveal, sensitive information (or information which may later be considered sensitive) is much harder to foresee. A lack of transparency in the way data is analysed and aggregated, combined with the difficulty of predicting which pieces of information may later prove damaging, means that many individuals have little perception of the potential adverse effects of the expansion in computational social science.
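
To make this aggregation risk concrete, the following minimal sketch (in Python, with invented records and an invented scoring rule) shows how two individually benign datasets can support a sensitive inference; the pattern echoes the retail pregnancy-prediction story reported by Hill (2012).

```python
# Two individually benign datasets about the same (hypothetical) user.
purchases = {"user42": ["unscented lotion", "zinc supplement", "cotton balls"]}
page_likes = {"user42": ["Parenting Weekly"]}

# A crude, invented scoring rule of the kind a large retailer might learn
# from its data (cf. Hill, 2012).
signals = {"unscented lotion", "zinc supplement", "cotton balls"}

def sensitive_inference(uid):
    """Combine purchase signals with page likes to make a sensitive guess."""
    hits = signals.intersection(purchases.get(uid, []))
    return len(hits) >= 2 and "Parenting Weekly" in page_likes.get(uid, [])

print(sensitive_inference("user42"))  # True: a sensitive guess from benign data
```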

Both the analysis of general trends and the profiling of individuals can be investigated through the social sciences. Applications of computational social science in the areas of social anthropology and political science can aid in the subversion of democracy. More than ever before, groups or individuals can be profiled, and the results used to better manipulate them. This may be as harmless as advertising for a particular product, or as damaging as political brainwashing. At the intersection of these examples, computational social science can be used to guide political advertising; people can be sold messages they will support and can be sheltered from messages with which they may disagree. Access to data may rest with the incumbent government, with those able to pay, or with those favoured by powerful data-rich companies.

Under its new terms of service, Google could, for instance, significantly influence an election by predicting messages that would engage an individual voter (positively or negatively) and then filtering content to influence that user's vote. The predictions could be highly accurate, making use of a user's email in their Google-provided Gmail account, their search history, their Google+ updates and social network connections, their online purchasing history through Google Wallet, and data in their photograph collection. The filtering of information could include recommended videos in YouTube; videos selectively chosen to highlight where one political party agrees with the user's views and where another disagrees with them. In Google News, articles could be given higher or lower visibility to help steer voters into making the "right" choice.

Such manipulation may not be immediately obvious; a semblance of balance can be given with an equal number of positive and negative points made against each party. What computational social science adds is the ability to predict the effectiveness of different messages for different people. A message with no resonance for a particular voter may objectively seem to provide balance, while in reality making little impact. Such services could not only be sold, but could be used by companies themselves to block the election of officials whose agenda runs contrary to their interests.
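
A minimal sketch of this "balanced but biased" selection follows (Python). The per-voter resonance scores and message names are hypothetical, standing in for predictions a real system would derive from a user's data.

```python
# Hypothetical per-voter resonance scores for pro and anti messages.
candidates = {
    ("A", "pro"):  [("A1", 0.9), ("A2", 0.7)],
    ("A", "anti"): [("A3", 0.1), ("A4", 0.2)],
    ("B", "pro"):  [("B1", 0.1), ("B2", 0.2)],
    ("B", "anti"): [("B3", 0.8), ("B4", 0.9)],
}

favoured = "A"  # the party the operator wishes to help

def pick(party, stance, msgs):
    # For the favoured party, show its strongest pro and weakest anti message;
    # for its rival, the reverse. Counts stay equal, so the feed looks balanced.
    boost = (stance == "pro") == (party == favoured)
    return (max if boost else min)(msgs, key=lambda m: m[1])[0]

feed = [pick(party, stance, msgs) for (party, stance), msgs in candidates.items()]
print(feed)  # ['A1', 'A3', 'B1', 'B4']: equal counts, unequal predicted impact
```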

    The ability to create such detailed profiles of individuals extends beyond the democratic process.

The ubiquity of computational social science tools, combined with an ever-increasing corpus of data and freedom from the ethical restrictions placed on researchers, poses serious questions about the impact that those who control the data and the tools can have on society as a whole. Traditionally, concerns about potential abuses of power focus on government and how its power can be limited to protect individuals; that focus needs to widen.

Computational social science, for good or ill, is limited by the availability of data. Issues surrounding the acquisition of data that can feed computational social science, and issues of control over access to that data, are key areas of public policy. By limiting data acquisition, sharing, and use, and by raising public awareness of the implications of its availability, there is a chance the ethical implications may be considered before the kind of privacy horror stories that are today relatively rare become more commonplace. Computational social science can be a great benefit in our search for knowledge, but as with all scientific advances, we must be aware of its risks.

    The use of computational social science in the social media sphere

If an employer looks at an employee's Facebook wall, is that an application of computational social science? Is Facebook itself a computational social science tool? Is ad targeting based on browsing habits or personal information from other applications a form of computational social science? We see these examples as everyday uses of social media-based computational social science.

Social media systems contain particularly valuable information. This data derives its value from its detail, personal nature, and accuracy. The semi-public nature of the data means it is exposed to scrutiny within a user's network; this increases the likelihood of accuracy when compared to data from other sources. The social media data stores are owned and controlled by private companies. Applications such as Facebook, LinkedIn, and the Google suite of products (including Google search, YouTube, DoubleClick and others) are driven by information sharing, but monetized through internal analysis of the gathered data: a form of computational social science. The data is used by four classes of users: business clients, government, other users within the social media platform, and the platform provider itself.

Business clients draw on this computational social science when they seek to target their advertisements. Facebook, for example, allows advertisers to target users based on variables that range from standard demographics, such as age, gender, and geographical location, to more personal information, such as sexual preferences. Users can also be targeted based on interests, associations, education level and employer. The Facebook platform makes this data (in aggregated form) available to advertisers for a specific purpose, yet Facebook's standard user interface can also be used as a general computational social science tool for other purposes.

To take an example, the Australian Bureau of Statistics (ABS) estimates the current population of Australia at 22.5 million (Australian Bureau of Statistics, 2010a). The Facebook advertising platform gives an Australian population (on Facebook) of 9.3 million: over 41 percent of the national population. Coverage is lower at the tails; Facebook has only 0.29 million people over 64, while the ABS says there are 3.06 million Australians over 65 (Australian Bureau of Statistics, 2010b). Even so, the sample for some age ranges must be approaching the entire population and may provide a very good model as a computational social science tool. For example, research shows that about two percent of the Australian population is not heterosexual (Wilson, 2004). From the Facebook advertising platform, we can readily select a population of Australians, aged 18 to 21, who are male, and whose sexual preference is for men. The platform immediately tells us the population size is 11,580 people. By comparing this to the total size of the Australian male Facebook population who expressed a sexual preference, we can see this accounts for 2.89 percent of that population, indicating that the data available to Facebook is of similar utility to that available to social scientists for research.
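
The arithmetic behind these figures can be checked directly; a minimal sketch (Python) follows, using only the numbers quoted above. The stated-preference denominator is back-calculated from the 2.89 percent figure, so it is an implied value, not one reported by Facebook.

```python
abs_population = 22.5e6       # ABS national estimate
facebook_population = 9.3e6   # Facebook's reported Australian audience

print(f"coverage: {facebook_population / abs_population:.1%}")  # ~41.3%

target_audience = 11_580      # men aged 18-21 whose stated preference is men
# Implied denominator: Australian male users who stated any preference.
stated_preference_males = round(target_audience / 0.0289)
print(f"share: {target_audience / stated_preference_males:.2%}")  # 2.89%
```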

The second class of users of social media as computational social science tools is governmental. This is demonstrated by the U.S. government's demands to Twitter (via court orders) for data on Wikileaks founder Julian Assange and those connected to him. The court order was only revealed after Twitter took legal action to lift a court-imposed censorship order relating to the requests (Dugan, 2011). The Wikileaks affair demonstrates how government can act when it sees social media as acting against its interests.

The very existence of social media can also promote a government's agenda. During the Iranian elections, for example, Twitter was asked not to take its service offline for scheduled maintenance (Musgrove, 2009). In another example, the U.S. State Department provided training in using the Internet to effect social change to Egyptian dissidents between 2008 and 2010, then sought (unsuccessfully) to keep social media access available during the January 2011 Egyptian anti-government protests (Morrow, 2011). The Egyptian effort was defeated after Egypt responded by taking the entire country off the Internet, a move perhaps more in response to the U.S. than the protestors. While social media might enable activism, computational social science favours the state, or at least those with power. Computational social science tools combined with social media data can be used to reconstruct the movements of activists, to locate dissidents, and to map their networks. Governments and their security services have a strong interest in this activity.

The third class of actors are other social media platform users. Journalist Ada Calhoun has described as an epiphany that left her "freaked out" the realisation that anyone could research her just as she researched others while writing their obituaries. In her article, Calhoun reflected that amateur experts on the anarchic message board 4chan, or professional experts working for government agencies, could likely find out far more than she could (Calhoun, 2011). The everyday danger that can result when anyone can research anyone else can be demonstrated through two scenarios:

Scenario one involves Mary, who has been a Facebook user for some years. Through Facebook, Mary reconnected with an old friend, Fred. As time went on, Mary and Fred grew closer and became a couple. One day Mary logged into her Facebook account and noticed that Fred had still not updated his details to say he was in a relationship with her. This made Mary feel very insecure and caused her to begin doubting Fred's intentions. Due to this discovery, Mary broke off her relationship with Fred.

Scenario two involves Joe, who applied to a company as a Human Resources team leader. The hiring manager, Bob, found Joe's resume appealing and considered him a good candidate. Bob decided to check Joe's Facebook information. On Joe's publicly viewable wall, Bob saw several pictures of Joe in what Bob considered to be questionable settings. The company never called Joe for an interview. Joe was given no opportunity to explain, nor given any explanation of why his application was rejected.

Both Mary and Bob used Facebook as a computational tool to extract selected information as part of an investigation into the social dynamics of society, or in these cases, a particular individual's interactions with society. In this sense, Facebook could be considered a computational social science tool. Mary's inference may be based on a wider realisation that Fred's interactions with her are all in private and not part of his wider representation of himself. Bob may have drawn his conclusions from a combination of text, pictures, and social interactions.

These situations are far from hypothetical. Research released in November 2011 by Telstra, Australia's largest telecommunications company, revealed that over a quarter of Australian bosses were screening job candidates based on social media (Telstra, 2011). At the start of 2012 the Australian Federal Police began an advertising campaign designed to warn the public of the need to protect their reputation online. The advertisement featured a job interview where the interviewer consults a paper resume then proceeds to note various positive attributes about the candidate; all seems to be going very well. The interviewer then turns to his computer screen and adds: "and I see from your recent online activity you enjoy planking from high-rise buildings, binge drinking, and posting embarrassing photos of your friends online" (Australian Federal Police, 2012). The advertisement is an accurate picture of the current approach, which takes place at the level of one user examining another. Computational social science may soon lead to software programs that automatically complete pre-selection and filtering of candidates for employment.
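
To illustrate how trivially such automated pre-selection could be expressed, here is a speculative sketch (Python) of a naive keyword filter over candidates' public posts. The terms, names and posts are all invented; a real system would be far more opaque.

```python
# Invented "risk" vocabulary of the kind a crude screening filter might use.
risk_terms = {"planking", "binge", "embarrassing"}

def screen(posts):
    """Return True if the candidate passes this crude keyword filter."""
    flagged = [p for p in posts if risk_terms & set(p.lower().split())]
    return not flagged

applicants = {
    "Joe": ["great night out binge drinking!", "planking off the car park roof"],
    "Sue": ["finished my marathon training"],
}
shortlist = [name for name, posts in applicants.items() if screen(posts)]
print(shortlist)  # ['Sue']: Joe is silently filtered out, as in the AFP ad
```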

The final class of actor we consider are social media platform providers themselves. While Facebook provides numerous metrics to profile users for advertisers, far more data and scope for analysis is available to a platform provider like Facebook itself. Internet advertisements are often sold on a cost-per-click (CPC) or cost-per-impression (CPM, with the M indicating that costs are typically conveyed per thousand impressions) basis. Thus, Facebook may maximise advertising revenue by targeting advertisements to achieve the greatest possible number of clicks for a given number of impressions. This maximisation of the click-through rate (CTR) can be achieved using a wealth of hidden information to model which users are most likely to respond to a particular advertisement. Computational social science can help a company like Facebook correctly profile its users, showing the right advertisements to the right people so as to maximize revenue. But what else can a company like Facebook or Google do? That depends on the data they hold.
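
A minimal sketch (Python) of this revenue logic follows; the bids and per-user CTR estimates are hypothetical, since how a platform actually models them is hidden.

```python
# Hypothetical cost-per-click bids, in dollars.
bids = {"ad_a": 0.40, "ad_b": 0.25}

# Hypothetical predicted click-through rates for one user, per ad.
predicted_ctr = {"ad_a": 0.002, "ad_b": 0.006}

def expected_revenue(ad):
    """Expected revenue per impression = predicted CTR x CPC bid."""
    return predicted_ctr[ad] * bids[ad]

# Serve the ad that maximises expected revenue for this impression:
# better user models mean higher CTR, and so higher revenue.
best_ad = max(bids, key=expected_revenue)
print(best_ad, expected_revenue(best_ad))  # ad_b 0.0015
```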

    The aggregation of data brings additional risk

While horizontal expansion of computational social science allows greater access to selected aggregate data, vertical expansion allows larger operators to add depth to their models. This depth is a result of triangulation, a method originally from land surveying. Triangulation gives a confirmation benefit by using additional data points to increase the accuracy of, and confidence in, a measurement. In a research context, triangulation allows information from multiple sources to be combined in a way that can expose underlying truths and increase the certainty of conclusions (Patton, 1990).

Social media platforms have added to their data either by acquiring other technology companies, as Google did when acquiring DoubleClick and YouTube, or by moving into new fields, as Facebook did when it created Facebook Places: a Foursquare-like geolocation service (McCarthy, 2010). From a computational social science perspective, geolocation services in particular add high-value information. Maximising the value of information requires a primary key that connects this data with existing information; a Facebook user ID or a Google account name provides just such a key.
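
Once a shared primary key exists, the aggregation step is mechanically trivial; a minimal sketch (Python) follows, with an invented user ID and invented records standing in for two services' datasets.

```python
# Two per-service datasets keyed by the same hypothetical user ID.
profile = {"u123": {"name": "Mary", "relationship": "single"}}
checkins = {"u123": [("2012-03-01", "Melbourne CBD"),
                     ("2012-03-02", "Monash University")]}

# Joining on the shared key "triangulates" the records into one deep profile.
combined = {uid: {**profile.get(uid, {}), "checkins": checkins.get(uid, [])}
            for uid in profile.keys() | checkins.keys()}
print(combined["u123"])  # identity, relationship and movement history, linked
```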

The breadth of an account measures how many types of online interaction the one account connects. It lets the company providing the account know about a wider slice of a user's life. Three situations are possible. The first involves distinct accounts on multiple sites and allows no overlap of data: what occurs on one site stays on that site. The second situation is where there is a single traceable login, for example an email address, which is used on multiple sites but where the sites are independent. Someone, or some computational social science tool, with access to the datasets could aggregate the data. The third possibility is a single login with complete data sharing between sites. All the data is immediately related and available to any query the underlying company devises. It is this last scenario that forms the Holy Grail for companies like Facebook and Google, and causes the most concern for users.

The announcement by Alma Whitten, Google's Director of Privacy, Product and Engineering, in January 2012 that Google would aggregate its data and "treat you as a single user across all our products" (Whitten, 2012) has led to a sharp response from critics. Jeffrey Chester, executive director of the Center for Digital Democracy, told the Washington Post: "There is no way a user can comprehend the implication of Google collecting across platforms for information about your health, political opinions and financial concerns" (Kang, 2012). In the same article, Common Sense Media chief executive James Steyer states bluntly that Google's new privacy announcement is "frustrating and a little frightening".

The depth of an account measures the amount of data an account connects. There are three possible situations. The first is an anonymous login with no connection to personal details; the virtual profile is complete in and of itself and may or may not truthfully represent the real world. The second situation is an account where user details are verified, for example a university login that is only provided once a student registers and identification papers have been checked. A number of online services and virtual communities are now using this model and checking government-issued identification to verify age (Duranske, 2007). The third situation involves an account that has a verified identity aggregated with other data collected from additional sources; for example, a credit card provider knows who its customers are, as well as where they have been and what they have bought. The temporal nature of the data is also a matter of depth; your current relationship status has less depth than your complete relationship history.

Facebook's Timeline feature signifies as large a change to depth as Google's policy change does to breadth. Timeline lets users quickly slide to a previous point in time, unearthing social interactions that had long been buried. A Facebook announcement on 24 January 2012 informed the world that Timeline was not optional and would, in a matter of weeks, be rolled out across all Facebook profiles (McDonald, 2012).

As Sarah Jacobsson Purewal noted in PC World, with Timeline it takes only a few clicks to see data that previously required around 500 clicks on the link labelled "older posts", each click separated by a few seconds' delay while the next batch of data loads (Purewal, 2012). Purewal (2012) provides a step-by-step guide to reasserting privacy under the new Timeline regime; the steps are numerous and the ultimate conclusion is that "you may want to just consider getting rid of your Facebook account and starting from scratch". Though admittedly not scientific, a poll by Sophos, an IT security and data protection company, showed that over half those polled were worried about Timeline (Cluley, 2012a). The survey included over 4,000 Facebook users from a population that is likely both more concerned and more knowledgeable about privacy and security than the average user. If that wasn't telling enough, the author of the announcement, Sophos senior technology consultant Graham Cluley, announced in the same article that he had shut down his Facebook account. Cluley's reasoning was a response to realising exactly how much of his personal data Facebook was holding, and fatigue at Facebook's ever-changing and non-consultative privacy regime (Cluley, 2012a; 2012b).

All accounts have both a breadth and a depth. Accounts that are identity-verified, frequently updated, and used across multiple aspects of a person's life present the richest data and pose the greatest risk. The concept of a government-issued national identity card has created fierce debate in many countries, yet that debate has been muted when the data is collected and held by non-government actors. Google's new ubiquitous account and Facebook's single platform for all forms of social communication should raise similar concerns for individuals as both consumers and citizens.

    A lesson from the social sciences

The rise of social media, with its social science capabilities, has placed technology professionals in a decision-making role over new ethical dilemmas. Ethical controversies are well known in both the technology field and the social sciences; the nature of the issues can, however, be different. In addition to a greater understanding of the ethical codes that apply to their own discipline, today's technology professionals in the social media space need an appreciation of the ethics of social science.

In 1969 a doctoral candidate at Harvard, Laud Humphreys, created one of the largest ethical controversies in social science. Constance Holden, writing in Science on ethics in social science research, described Humphreys as having "deceived his subjects, failed to get anything remotely resembling informed consent from them, lied to the Bureau of Motor Vehicles, and risked doing grave damage to the psyches and reputations of his subjects". Humphreys had chosen subjects without their consent, and then collected and aggregated data about them. His data was collected multiple times, in multiple different guises, and without informing his subjects of his true purpose (Holden, 1979). His experiment, which examined the behaviour of homosexuals, led to a book entitled Tearoom trade (Humphreys, 1970), which aimed to demonstrate that homosexuals were regular people and not a danger to society.

Today, research like Humphreys' would by necessity include an element of computational social science. Indeed, Calhoun (2011) details how she engaged in just such research when writing the story of Tyler Clementi, a gifted teenage violinist who committed suicide after a sexual encounter with another man in his dorm room was allegedly streamed over the Internet.

In discussing the ethics of social science research, Holden noted two schools of thought: utilitarianism (also known as consequentialism) holds that an act can only be judged on its consequences; deontology (also known as non-consequentialism) is predominantly about absolute moral ethics. In the 1960s utilitarianism was dominant, along with moral relativism; in the late 1970s deontology began to hold sway (Holden, 1979). In computational social science, the debate seems to be academic, with little regard given to ethics. Conditions of use are typically one-sided and set without user input, although Wikipedia is a notable exception (Konieczny, 2010). Companies expand their services and data sets with little regard for ethical considerations, and market forces in the form of user backlashes form the first, and often only, line of resistance.

One such backlash occurred over Facebook's Beacon software, which was eventually cancelled as part of an out-of-court settlement. Beacon connected people's purchases to their Facebook account;


that this fine-grained ability, which allows users to instruct a Web site not to use a specific piece of personal data for a specific purpose, represents a significant improvement on the current personal data free-for-all model used by both social networking companies and their corporate customers.

    The government as protector of consumer rights

    With the increasing amount of personal information shared by users of social networking platforms,and a tendency for the data to be stored indefinitely, there is a strong need for consumer protection.

The rights of users need to be protected while minimizing the impact on both platform providers and other companies seeking to provide innovative new services.

Ethics, as an academic discipline, suggests two very different approaches to address the balance between the protection of users and the freedom of companies. The deontological approach judges actions based on their adherence to rules. The rules may be seemingly arbitrary; a contravention which does no real harm might still be treated very seriously. This is the approach adopted when roads are given speed limits: the rule is enshrined in law and carries strict liability. The alternative ethical approach is consequentialism, an outcome-oriented approach which judges actions based on their ultimate impact. This is the approach adopted in negligence cases, where the wrong must cause damage before it becomes actionable.

The consequentialist approach would give companies more freedom, but also greater liability. It does little to protect consumers from preventable harm; as it is impossible to predict the future use of personal data, or the consequences that may result, that harm may be significant. The deontological approach would place the burden on social media sites to restrict the storage, retrieval and manipulation of data in ways that limit its usefulness. This would prevent abuse but would also limit innovation. The introduction of such regulation could have a particularly stifling effect on new market entrants.

A consequentialist approach has held sway as the social media industry has developed, but the public is increasingly looking to government to safeguard personal rights and freedoms. Regulation requires a deontological approach. This section explores some of the difficulties faced by regulators seeking to protect users' privacy without placing an undue burden on the nascent industry.

The fundamental premise of a social networking site's business model is that users provide personal information and content, such as pictures, that they have created. In return for facilitating a sharing of content between users, the platform displays targeted advertising. The targeting of adverts is based on data mining of the information the platform holds. The system allows advertisers to target advertisements at users most likely to be interested in them, and allows the social networking Web site to charge a premium per advertisement view or click. Agreement to this operating model is the basis of a social networking Web site's end-user license agreement (EULA), which every user agrees to upon joining. However, what happens when a user decides they no longer wish to participate? What if the user would prefer the social networking site removed the personal information previously handed over? What if the user lied about their age, or was otherwise unable to enter into the EULA? What about people whose data was uploaded by third parties and who have no privity of contract [1] with the platform provider? How is a user supposed to make an informed decision as to whether they would like their data removed if it is collected and used opaquely, as is the case with Google's data collection? These are the questions regulators need to consider before drafting legislation, and so far they appear to have been largely overlooked.

In late 2010, the European Commission announced a public consultation on personal data protection in the European Union (European Commission, 2011). Several mainstream news outlets reported that the Commission was considering a "right to be forgotten" (BBC News, 2010; Warman, 2010). Such a right would require social networking sites to provide a mechanism by which users may remove their profile and related information. Most existing social networking Web sites currently provide this facility, either of their own volition or as a result of past public pressure over the issue. Facebook famously tried to avoid providing this facility, instead implementing what commentators called the "Hotel California" policy (Williams, 2007; M.G., 2010), whereby users could deactivate, but not remove, their profiles. This was reversed in 2008 after significant public pressure (Williams, 2008). Even so, the effort required to delete an account is significant. Facebook has implemented multiple strategies to push users into deactivating their account rather than deleting it (Cluley, 2012b). Deactivation stops data being publicly shared, but allows Facebook to continue to hold it until the user takes some action to interact with Facebook, at which point the account is revived.

The ability to be forgotten is a blunt and indiscriminate tool. It does little to allow users control over their personal information: it merely grants users a right to end the agreement into which they entered on joining the social network. Users' information is removed, at the price of them being unable to continue using the social networking service. Furthermore, with social networking Web sites offering authorisation services to independent sites (OAuth, 2011), a forgotten user loses the ability to access these third-party sites, with the possibility of yet more personal information on the third-party sites becoming orphaned in the process.

We argue that a right to be forgotten as proposed by the European Commission is too coarse-grained, leaving users with a Hobson's choice [2] between allowing the continued retention of all their personal data, or losing access to what may potentially be a large proportion of their online experience. More fine-grained controls, allowing users to remove specific pieces or clusters of personal information without affecting their ability to use social networking sites (save for any inevitable consequence of the information's unavailability), are essential if users are to be given genuine control over their personal data. This fundamental requirement cannot be fulfilled without significant implementation difficulty, but without it regulators' initiatives will be ineffective.
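
A minimal sketch (Python) of what such fine-grained control might look like follows; the data model and names are our own assumptions, recording consent per data item and per purpose so that one item can be withdrawn without deleting the account.

```python
from dataclasses import dataclass, field

@dataclass
class DataItem:
    value: str
    allowed_purposes: set = field(default_factory=set)  # e.g. {"display", "ad_targeting"}

# A hypothetical user profile with per-item, per-purpose permissions.
profile = {
    "relationship_status": DataItem("single", {"display", "ad_targeting"}),
    "photo_1234": DataItem("beach.jpg", {"display"}),
}

def withdraw(item, purpose):
    """Revoke one use of one piece of data; the account itself survives."""
    profile[item].allowed_purposes.discard(purpose)

withdraw("relationship_status", "ad_targeting")
print(profile["relationship_status"].allowed_purposes)  # {'display'}
```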

Personal information is also held by social networking sites on individuals who do not, and have never, used the service or agreed to a EULA. Facebook, for example, allows people to be identified ("tagged") by name in images even if they have no Facebook account. These pictures are typically taken and uploaded by those known to the individual, and the social networking site will also contain information on when and where the picture was taken, what event was being held, and which other Facebook members attended. Facebook developers have previously shown interest in developing a tag-based image search ("Facebook Photo tagged searches", n.d.), and facial-recognition-based image searching is also becoming available (Face.com, 2012). Both features allow a profile of an individual's past movements, activities and social interactions to be built despite the individual having never granted any consent for any of this information to be held or used. Clearly, any right to be forgotten must extend to all people on whom data is held by a social networking site, and not just to registered users.

Even if regulators arrive at an effective regulatory framework that affords individuals the right to remove data held on them by social networking sites, unilateral lawmaking on the part of a single jurisdiction has rarely proven effective on the Internet (Goldsmith, 2000; Benkler, 2000). The very nature of the Internet means that Web sites may choose the physical location of their hosting infrastructure freely. They are free to practice jurisdiction shopping and to choose to operate from a jurisdiction of least regulation. To enforce effective control on social networking sites, lawmakers are faced with the difficult task of legislating multilaterally.

    The government as a user of data

Governments are not just the guardians of the rights of citizens. They are also heavy users of data, with their own strong interests in computational social science. Not only do they have vast amounts of government data, but any data held by a company operating within a government's jurisdiction must be disclosed when the government issues a lawful request.

Most countries balance the state's desire to access information, for example for law enforcement and national security reasons, with the citizen's right to privacy. Law enforcement agencies must usually show reasonable cause to a judge before they can obtain a warrant to search premises, or demand a company disclose customer information. While the precise rules and procedures vary, safeguards to prevent abuse of government power are usually strong in western democracies. A rare exception, and the resulting public backlash, can be seen in recent moves in the U.K. to allow warrantless real-time access by police to all Internet connection information (Whitehead, 2012). The proposal has seen Britain compared to China and Iran.

The ability of government to demand personal information held within its jurisdiction is not new: the problem lies with the increase in the amount of data that is available, and the ability of computational social science to process and make sense of it. Advanced computational social science tools can be used by oppressive regimes to increase surveillance, target dissidents, and erode civil rights. The questionable use by liberal democracies of similar powers legitimizes the problematic use by these regimes.

The U.S. has also made questionable demands for personal data. Worse still, the demands have been made in secret. For example, Twitter was ordered to hand over details of the users associated with the whistleblowing site Wikileaks, as well as details related to accounts that followed Wikileaks (BBC News, 2011). The move was particularly unusual given that the activity Twitter was involved with, the act of publication, is protected under the First Amendment (Elsea, 2010). Twitter had the data, and the government wanted it.

Dutch hacker Rop Gonggrijp noted that "it appears that Twitter, as a matter of policy, does the right thing" in wanting to inform their users when access to their data is demanded (Dugan, 2011). This was no small thing, as Twitter itself had to fight a suppression order through the courts. Gonggrijp, whose information was provided to the U.S. government in relation to Wikileaks, wondered how many other social media platforms received similar subpoenas and handed over data without efforts to enable disclosure to their users (Dugan, 2011). The lack of response from Google and Facebook to multiple media enquiries on the topic raises serious questions.

Once gathered by a technology company, personal information can be obtained by any government with sufficient jurisdiction. The government use of such information is distinct from the data's intended use by other classes of users. For government, the availability of computational social science tools, and the ability to potentially access vast amounts of private data, is a potent mix. This is of particular concern to civil rights activists and others whom government may deem subversive. The very government seeking to silence criticism also gets to determine the regulation governing the social networking provider's duty to disclose information. Even sites operating outside of a government's jurisdiction may be forced to cooperate under threat of government sanctions, such as a denial of access to their market or the risk of an intergovernmental incident.

    Platform providers and the ethical development of social media

The lack of importance given to ethical consideration is largely a result of the interdisciplinary nature of computational social science. Traditionally, computer science research involves minimal collection of data on individuals; the ethical barriers to experimentation are low. The social sciences, by contrast, are primarily focused on personal information. Ethical considerations play a larger role in the social sciences, and data sets tend to be more limited in size and must be explicitly collected. Social scientists are not more ethical than computer scientists or engineers, but they are likely to have more relevant training and experience when it comes to managing personal data.

Professional engineering and computing bodies have relevant codes of ethics. The IEEE Code of Ethics (IEEE, 2012a) commits members to improve the understanding of technology, as well as its appropriate application and potential consequences. The ACM Code of Ethics and Professional Conduct (ACM, 1992) commits members to avoid harm to others, respect the privacy of others, and improve public understanding of computing and its consequences. A joint IEEE Computer Society/ACM Code of Ethics for Software Engineers, created in 1999 (IEEE, 2012b), states that software engineers shall "act consistently with the public interest", and shall "act in a manner that is in the best interests of their client and employer, consistent with the public interest" (Lazer, et al., 2009).

The problem, then, is not a lack of ethical guidelines, but rather a lack of relevant application of these principles when it comes to social media. Relevant application begins with an understanding that social media platforms are a form of computational social science. Once that point is realised, social media technology professionals need to be informed by the social sciences' ethical concerns, and specifically by the concerns raised by computational social science. Cross-disciplinary teams can be used not only to assist development, but also to assist in the ethical consideration of new projects. Both computer scientists and social scientists need to contribute to the discussion; together, new angles can be considered and problems can be avoided. Good design will not solve all the problems, but it can reduce the opportunity for the abuse of a very powerful tool.

    Platform users and the ethical use of social media

Users of social media have an ethical responsibility to one another; both education and cultural change are needed. An information producer code of ethics could promote the required change in online society. Such a code could highlight the issues to be considered when publishing information. For example, when a Facebook user uploads photographs, their action may reveal information about others in their network; the impact on those other people should be considered under a producers' code of ethics. A consumer code of ethics is also needed; such a code would cover users viewing information posted by others through a social media platform. A consumer code could raise questions of when it is appropriate to further share information, for example by reposting it. Producer and consumer are the most basic roles in a social media platform; more specialised types of role, with their own considerations both as producers and consumers, can be derived from a wide variety of real-world relationships.

Specialised roles range from the everyday, such as parent and child or employer and employee, to the exceptional, for example credit and insurance companies and their clients. In establishing the boundaries of what is ethically permissible, regard should be had for the existing boundaries in society and for existing levels of regulation. People in some relationships should be prevented from using computational social science at the individual level altogether; health insurance companies are an example.


Platform providers need to supplement existing terms of service by providing clearer guidelines to help users determine what they are publishing, to avoid a Beacon-like situation, and to alert them to the potential impact of publishing information. A "Principles of Engagement" (POE) document could be developed to provide guidance, and the power of social media itself can be used to let those who can see content warn the owner when the content may pose a risk. General recommendations can also be useful: for example, that users limit themselves to fair comment when access is open to more than personal friends. Another general suggestion is limiting photographs to those each of the subjects in the photograph would be happy for their parents or employers to see. Such guidelines can help users avoid potential future issues for themselves or others.

Ethics on the consumer side relate to how information is accessed and how the information gained is used. We need a cultural mind shift to become more forgiving of information exposed through social media; or an acceptance that social media profiles are private and must be locked down with ever more complex filters; or an acceptance that, even if available, information from social media should not be used in certain settings. These approaches would each change the nature of social media as a computational social science tool; they would move some aspects into, and others out of, the tool's field of observation. As an instrument-based discipline, the way the field is understood can be changed either by changing the nature of the tool, or by changing the way we allow it to be used.

    Conclusion

Social media platforms are collecting vast amounts of data and are functioning as tools for computational social science. The most visible example, Facebook, provides relatively open access to potentially sensitive personal information about individual users; in less visible examples, such as the dataset created by Google, data is held and used opaquely. The large-scale collection, retention, aggregation, use and disclosure of detailed, triangulated personal information offers the possibility of incredibly powerful computational social science tools, but brings with it the potential for abuse by governments and private entities.

Social networking has made available a rich and wide-ranging dataset covering large sections of the population. Even at this nascent stage, the social networking industry offers users, researchers and governments a powerful ability to identify trends in behaviour amongst a large population, and to find vast quantities of information on an individual user. As the industry develops, it would be reasonable to expect that these abilities will increase in scope, accuracy and usefulness. Future development may be driven by technological progress developing better tools, or by natural expansion in the underlying datasets. Data expansion will inevitably include greater aggregation of data as company acquisitions merge previously discrete datasets. Society is only just beginning to consider the possible privacy and ethical implications of this amount of personal information being so readily available, and at present there are few ethical or regulatory barriers restricting the collection, retention and use of personal information.

We have shown that the data collected by social networking sites can be useful to social scientists, and that the sites themselves can be viewed as computational social science tools. As a research tool, social networking data offers considerable breadth and depth, but offers limited coverage of some groups, e.g., the elderly.

We have discussed the legal and ethical implications of social networking as a computational social science tool. We have argued that computational social science, as an interdisciplinary approach, must apply the broad ethical considerations adopted by computer and engineering professional bodies in a manner consistent with, and informed by, the ethics of traditional social science research. As part of this, guidelines and regulation that set limits on the collection, retention, use and disclosure of personal information are needed for end users.

We have argued that the type of regulation currently being considered will fall well short of providing an acceptable level of protection for individuals, and that despite the additional burden placed on social networking site operators, a more fine-grained approach must be developed. Users should be able to remove individual pieces of data they no longer wish a social media company, or its users, to be able to access. Recent developments by the titans of the social media landscape, Facebook and Google, suggest the risk to the public is going to rise until regulation intervenes.

    About the authors

Andre Oboler is CEO of the Online Hate Prevention Institute and a postgraduate law student at Monash University. He holds a Ph.D. in computer science from Lancaster University and completed a postdoctoral fellowship in political science at Bar-Ilan University.
Web: www.oboler.com

    Email: andre [at] Oboler [dot] com

Kristopher Welsh is a lecturer in the School of Computing at the University of Kent. He holds a Ph.D. in computer science from Lancaster University.
Web: www.kriswelsh.com
Email: K [dot] Welsh [at] kent [dot] ac [dot] uk

Lito Cruz is a teaching associate at Monash University and a part-time lecturer at Charles Sturt University. He holds a Ph.D. in computer science from Monash University.
Web: www.sleekersoft.com
Email: lcruz [at] sleekersoft [dot] com

    Notes

1. Privity of contract is a rule of contract law which holds that only the parties to a contract are bound by the contract, and only they can enforce it. The rule prevents the burden of a contract falling on someone who had no part in the act of agreeing to accept the contract; Tweddle v Atkinson [1861] 121 ER 762 (see also http://en.wikipedia.org/wiki/Tweddle_v_Atkinson).

2. http://en.wikipedia.org/wiki/Hobson's_choice.

    References

Association for Computing Machinery (ACM), 1992. "ACM code of ethics and professional conduct," at http://www.acm.org/about/code-of-ethics, accessed 30 January 2011.

Australian Bureau of Statistics, 2010a. "3101.0 - Australian Demographic Statistics, Jun 2010," at http://www.abs.gov.au/AUSSTATS/[email protected]/DetailsPage/3101.0Jun%202010, accessed 30 January 2011.

Australian Bureau of Statistics, 2010b. "3201.0 - Population by Age and Sex, Australian States and Territories, Jun 2010," at http://www.abs.gov.au/Ausstats/[email protected]/mf/3201.0, accessed 30 January 2011.

Australian Federal Police, 2012. "Protecting your reputation online," at http://www.afp.gov.au/AFP-Homepage/policing/cybercrime/crime-prevention.aspx, accessed 10 January 2012.

BBC News, 2011. "US wants Twitter details of Wikileaks activists" (8 January), at http://www.bbc.co.uk/news/world-us-canada-12141530, accessed 30 January 2011.

BBC News, 2010. "EU wants right to be forgotten online" (4 November), at http://www.bbc.co.uk/news/business-11693026, accessed 28 January 2012.

Yochai Benkler, 2000. "Internet regulation: A case study in the problem of unilateralism," European Journal of International Law, volume 11, number 1, pp. 171-185.

blue_beetle, 2010. "User-driven discontent," MetaFilter (26 August), at http://www.metafilter.com/95152/Userdriven-discontent#3256046, accessed 20 May 2012.

Ada Calhoun, 2011. "I can find out so much about you," Salon (18 January), at http://www.salon.com/life/internet_culture/?story=/mwt/feature/2011/01/18/what_i_can_find_online, accessed 30 January 2011.

Claudio Cioffi-Revilla, 2010. "Computational social science," Wiley Interdisciplinary Reviews: Computational Statistics, volume 2, issue 3, pp. 259-271.

Graham Cluley, 2012a. "Poll reveals widespread concern over Facebook Timeline," Naked Security (27 January), at http://nakedsecurity.sophos.com/2012/01/27/poll-reveals-widespread-concern-over-facebook-timeline/, accessed 10 February 2012.

Graham Cluley, 2012b. "Why I left Facebook," BBC College of Journalism blog (10 January), at http://www.bbc.co.uk/journalism/blog/2012/01/why-i-left-facebook.shtml, accessed 10 February 2012.

Emily Dugan, 2011. "US demands Twitter release Assange details," Independent (9 January), at http://www.independent.co.uk/news/world/americas/us-demands-twitter-release-assange-details-2179740.html, accessed 30 January 2011.

Benjamin Duranske, 2007. "IMVU deploys third-party age verification solution" (24 September), at http://virtuallyblind.com/2007/09/24/imvu-age-verifiction/, accessed 28 May 2012.

    Jennifer K. Elsea, 2010. "Criminal prohibitions on the publication of classified defense information," Congressional Research Service (10 September), at http://fpc.state.gov/documents/organization/148793.pdf, accessed 30 January 2012.

    European Commission, 2011. "Consultation on the Commission's comprehensive approach on personal data protection in the European Union," at http://ec.europa.eu/justice/news/consulting_public/news_consulting_0006_en.htm, accessed 10 February 2012.

    Face.com, 2012, at http://face.com/, accessed 10 March 2012.

    "Facebook Photo tagged searches," at http://www.catchmeifyouknowhow.com/index.php/tutorials/social-media/item/facebook-photo-tagged-searches, accessed 30 January 2012.

    Jack Goldsmith, 2000. "Unilateral regulation of the Internet: A modest defence," European Journal of International Law, volume 11, number 1, pp. 135–148.

    Kashmir Hill, 2012. "How Target figured out a teen girl was pregnant before her father did," Forbes (16 February), at http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/, accessed 20 May 2012.

    Constance Holden, 1979. "Ethics in social science research," Science, volume 206, number 4418 (2 November), pp. 537–538.

    Laud Humphreys, 1970. Tearoom trade: Impersonal sex in public places. Chicago: Aldine.

    Institute of Electrical and Electronics Engineers (IEEE), 2012a. "IEEE code of ethics," at https://www.ieee.org/about/corporate/governance/p7-8.html, accessed 30 January 2011.

    Institute of Electrical and Electronics Engineers (IEEE), 2012b. "FAQ and resources: Software engineering code of ethics and professional practice," at http://www.computer.org/portal/web/certification/resources/code_of_ethics, accessed 17 June 2012.

    Cecilia Kang, 2012. "Google announces privacy changes across products; users can't opt out," Washington Post (24 January), at http://www.washingtonpost.com/business/economy/google-tracks-consumers-across-products-users-cant-opt-out/2012/01/24/gIQArgJHOQ_story.html, accessed 28 January 2012.

    Piotr Konieczny, 2010. "Adhocratic governance in the Internet age: A case of Wikipedia," Journal of Information Technology & Politics, volume 7, number 4, pp. 263–283.

    David Kravets, 2010. "Judge approves $9.5 million Facebook Beacon accord," Wired (17 March), at http://www.wired.com/threatlevel/2010/03/facebook-beacon-2/, accessed 30 January 2011.

    David Lazer, Alex Pentland, Lada Adamic, Sinan Aral, Albert-László Barabási, Devon Brewer, Nicholas Christakis, Noshir Contractor, James Fowler, Myron Gutmann, Tony Jebara, Gary King, Michael Macy, Deb Roy, and Marshall Van Alstyne, 2009. "Computational social science," Science, volume 323, number 5915 (6 February), pp. 721–723.

    M.G., 2010. "Facebook and transparency: Facebook and the Hotel California," Economist (6 October), at http://www.economist.com/node/21011590, accessed 30 January 2011.

    Philip Mai, 2012. "If you're not paying for it, you're the product: What is the $value of social data?" Social Media Lab (9 April), at http://socialmedialab.ca/?p=6076, accessed 20 May 2012.

    Caroline McCarthy, 2010. "Facebook granted geolocation patent," CNet News (6 October), at http://news.cnet.com/8301-13577_3-20018783-36.html, accessed 30 January 2011.

    Paul McDonald, 2012. "Timeline: Now available worldwide" (24 January), at http://blog.facebook.com/blog.php?post=10150408488962131, accessed 28 January 2012.

    Adrian Morrow, 2011. "U.S. officials backed rebels planning Egyptian uprising in 2008: WikiLeaks," Globe and Mail (28 January), at http://www.theglobeandmail.com/news/world/africa-mideast/us-officials-backed-rebels-planning-egyptian-uprising-in-2008-wikileaks/article1887439/, accessed 30 January 2011.

    Mike Musgrove, 2009. "Twitter is a player in Iran's drama," Washington Post (17 June), at http://www.washingtonpost.com/wp-dyn/content/article/2009/06/16/AR2009061603391.html, accessed 30 January 2011.


    Ellen Nakashima, 2007. "Feeling betrayed, Facebook users force site to honor their privacy," Washington Post (30 November), at http://www.washingtonpost.com/wp-dyn/content/article/2007/11/29/AR2007112902503.html, accessed 28 April 2012.

    Nathan Newman, 2011. "You're not Google's customer – you're the product: Antitrust in a Web 2.0 world," Huffington Post (29 March), at http://www.huffingtonpost.com/nathan-newman/youre-not-googles-custome_b_841599.html, accessed 1 May 2012.

    OAuth, 2011. OAuth 2.0, at http://oauth.net/2/, accessed 30 January 2011.

    Andre Oboler, 2012. "What market forces mean for Facebook," Jerusalem Post (24 May), at http://blogs.jpost.com/content/what-market-forces-mean-facebook, accessed 25 May 2012.

    Andre Oboler, Gerald Steinberg and Rephael Stern, 2010. "The framing of political NGOs in Wikipedia through criticism elimination," Journal of Information Technology & Politics, volume 7, number 4, pp. 284–299.

    Michael Patton, 1990. Qualitative evaluation and research methods. Second edition. Newbury Park, Calif.: Sage.

    Sarah Jacobsson Purewal, 2012. "Facebook Timeline privacy tips: Lock down your profile," PC World (31 January), at http://www.pcworld.com/article/249019/facebook_timeline_privacy_tips_lock_down_your_profile.html, accessed 10 February 2012.

    Telstra, 2011. "Aussies urged to consider their cyber CVs as bosses head online" (28 November), at http://www.telstra.com.au/abouttelstra/media-centre/announcements/aussies-urged-to-consider-their-cyber-cvs-as-bosses-head-online.xml, accessed 5 December 2011.

    Matt Warman, 2010. "EU proposes right to be forgotten," Telegraph (5 November), at http://www.telegraph.co.uk/technology/internet/8112702/EU-proposes-online-right-to-be-forgotten.html, accessed 10 February 2012.

    Tom Whitehead, 2012. "New powers to record every phone call and email makes surveillance 60m times worse," Telegraph (2 April), at http://www.telegraph.co.uk/technology/news/9180191/New-powers-to-record-every-phone-call-and-email-makes-surveillance-60m-times-worse.html, accessed 28 April 2012.

    Alma Whitten, 2012. "Updating our privacy policies and terms of service," The Official Google Blog (24 January), at http://googleblog.blogspot.com.au/2012/01/updating-our-privacy-policies-and-terms.html, accessed 24 January 2012.

    Christopher Williams, 2008. "UK data watchdogs drop Facebook probe," Register (26 February), at http://www.theregister.co.uk/2008/02/26/ico_facebook_investigation_complete/, accessed 30 January 2011.

    Christopher Williams, 2007. "Microsoft-Facebook: Welcome to the Hotel California," Register (25 October), at http://www.theregister.co.uk/2007/10/25/microsoft_facebook_comment/, accessed 30 January 2011.

    Shaun Wilson, 2004. "Gay, lesbian, bisexual and transgender identification and attitudes to same-sex relationships in Australia and the United States," People and Place, volume 12, number 4, pp. 12–21.

    Editorial history

    Received 10 March 2012; revised 31 May 2012; accepted 13 June 2012.

    This paper is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

    The danger of big data: Social media as computational social science
    by Andre Oboler, Kristopher Welsh, and Lito Cruz
    First Monday, Volume 17, Number 7 - 2 July 2012
    http://firstmonday.org/ojs/index.php/fm/rt/printerFriendly/3993/3269
    doi: 10.5210/fm.v17i7.3993
