
LabintheWild: How to Design Uncompensated, Feedback-driven Online Experiments

Nigini Oliveira
Computer Science & Engineering, DUB group
University of Washington
[email protected]

Katharina Reinecke
Computer Science & Engineering, DUB group
University of Washington
[email protected]

Eunice Jun
Computer Science & Engineering, DUB group
University of Washington
[email protected]

Trevor Croxson
Computer Science & Engineering, DUB group
University of Washington
[email protected]

Krzysztof Z. Gajos
School of Engineering and Applied Sciences
Harvard University
[email protected]

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Copyright is held by the owner/author(s).
CSCW '17 Companion, February 25 - March 01, 2017, Portland, OR, USA
ACM 978-1-4503-4688-7/17/02.
http://dx.doi.org/10.1145/3022198.3023267

Abstract
We present LabintheWild, an online experiment platform that provides participants with the opportunity to learn about themselves and compare themselves to others. In the past four years, LabintheWild has attracted approximately 3.5 million participants of diverse ages and education levels who came from more than 200 countries and regions. Leveraging our experience with LabintheWild, we show how researchers can design engaging and robust online experiments that provide personalized feedback. In particular, our interactive tutorial highlights challenges and best-practice guidelines for designing volunteer-based online experiments for diverse participant samples.

Author Keywords
Online experimentation; behavioral experiments; tutorials

ACM Classification Keywords
H.5.m [Information interfaces and presentation (e.g., HCI)]: Miscellaneous

Introduction
Researchers in HCI and various social science disciplines have been increasingly turning to online platforms, such as Amazon's Mechanical Turk, to conduct behavioral experiments with human subjects. Compared to traditional laboratory experiments, online studies offer faster and less troublesome participant recruitment, as well as larger and more diverse samples [6]. However, such financially incentivized online studies have limits: Studies requiring effortful, but unverifiable, contributions from workers (e.g., subjective responses on Likert scales) remain difficult to conduct on Mechanical Turk because some workers, motivated by quick financial gain, provide plausible, but untrue, answers [8]. Further, Mechanical Turk's user base is insufficiently diverse for studies spanning many countries; for example, the plurality of Turkers come from the United States and India [6]. Further still, the subject pool accessible via online labor platforms is restricted to those who have signed up to receive payment in one of two currencies (U.S. dollars and Indian rupees [6]). This sign-up barrier systematically excludes participants from certain countries and with certain demographics and personality traits (see, e.g., [3]).

With this demonstration of our online experiment platform LabintheWild [7], we show a complementary approach for conducting large-scale studies: studies with online volunteers in which participants receive no financial compensation but instead benefit from personalized feedback. On LabintheWild (as on other similar platforms, such as ProjectImplicit, TestMyBrain, and OutofService), experiments enlist participants using short incentive slogans, such as "How fast is your reaction time?" or "Test your social intelligence!" (see Figure 1). After participating in an experiment, volunteers can view a personalized results page that shows how they compare to others.

Figure 1: The LabintheWild front page, an example trial page, and a personalized results page.

Such volunteer-based online experiments have several strengths compared to paid online experiments. First, the personalized feedback encourages participants to provide truthful answers [2, 7]. Second, existing volunteer-based platforms have attracted more diverse participant samples than in-lab experiments and those conducted on Mechanical Turk, with participants reporting wider age ranges, more diverse educational backgrounds, and a far more expansive geographic dispersion (see, e.g., [2, 5, 7]). Also, while estimates suggest that a research study (with unlimited financial resources) could access up to 7,300 unique participants via Mechanical Turk [9], several of our LabintheWild studies attracted more than 40,000 participants (and two exceeded 300,000). Despite these strengths, volunteer-based online experiments remain far less prevalent than financially compensated ones. A major reason is that designing engaging and robust volunteer-based experiments for diverse participants requires different knowledge and experience than designing financially compensated experiments.

The goal of this demonstration is to equip other researchers with the know-how to design volunteer-based online experiments that incentivize participation with personalized feedback. Building on more than four years of experience with LabintheWild, we developed an interactive experiment tutorial that highlights the most common challenges and best-practice guidelines. Researchers will be able to use the tutorial to answer questions such as:

• What are the main differences in design between financially compensated and uncompensated experiments?

• What kinds of experiments are possible?

• How do I effectively incentivize participation?

• How do I design a rewarding experience for participants?

• How can I design experiments for a diversity of participants (including demographics and devices used to access the studies)?

• How do I ensure data quality in volunteer-based online experiments?

• What is the process for launching my own experiment on LabintheWild?


The LabintheWild Platform
Our volunteer-based online experiment platform LabintheWild was launched in 2012 with the goal of reaching larger and more diverse participant samples than was possible with either conventional lab studies or online labor markets such as Amazon Mechanical Turk. We made several key design decisions to achieve this goal:

First, LabintheWild provides unrestricted online access. Experiments run without experimenter supervision, which allows participation in large numbers around the clock, independent of location and time zone. The experiments are open to anyone, and no sign-up or account creation is needed to participate. This decision lowers the barrier to participation and protects participants' privacy.
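To make the account-free design concrete, the following minimal Python sketch shows one way a platform in this style could group the responses of a single anonymous session; the record fields and function name are hypothetical illustrations, not LabintheWild's actual implementation.

```python
import uuid
from datetime import datetime, timezone

def start_session() -> dict:
    """Create an anonymous participation record.

    A random UUID ties together the responses of one session
    without requiring an account or identifying the person.
    """
    return {
        "session_id": uuid.uuid4().hex,  # anonymous, unguessable identifier
        "started_at": datetime.now(timezone.utc).isoformat(),
        "responses": [],
    }

# Example: record one trial response within the anonymous session.
session = start_session()
session["responses"].append({"trial": 1, "answer": "left", "rt_ms": 412})
```

Because the identifier is random and stored only alongside the session's responses, it links trials together without revealing who the participant is.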

Second, LabintheWild provides personalized feedback instead of financial compensation. Paying participants would limit both the size and the diversity of the participant pool, because financial resources are limited and because only some participants are able to receive online payments or are interested in earning virtual money. Instead, we leverage people's urge to learn about themselves and to compare themselves to others [1]: After participating in an experiment, participants are shown a personalized results page (see Figure 1), which explains how they did and how their performance or preferences compare to others. The feedback usually corresponds to our own research questions or to some other aspect of the experiment that might be interesting to participants. For each experiment, we produce a short slogan that advertises the type of feedback participants will receive (e.g., "Are you more Eastern or Western?"). We use these slogans to advertise each experiment on LabintheWild's front page and to share the experiments via social media.

The personalized feedback serves three main purposes. First, it encourages participants to take part in experiments because it enables self-reflection and social comparison. Second, it ensures data quality: Participants are intrinsically motivated to provide honest answers and exert themselves. As a result, experiments conducted on LabintheWild and other volunteer-based experiment platforms produce reliable data that matches the quality of in-lab studies [2, 4, 7], including studies that require effortful subjective judgment, which pose a challenge on Mechanical Turk [8]. Third, the personalized feedback serves as a word-of-mouth recruitment tool. Participants share their results on social networking sites, blogs, or other web pages, which generates a self-perpetuating recruitment cycle [7]. For example, most LabintheWild participants come to the site from online social networks such as Facebook and Twitter. Since launching four years ago, LabintheWild has been visited more than 3.5 million times, with an average of over 1,000 participants per day completing an experiment.
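As an illustration of this kind of comparison feedback, here is a minimal Python sketch that ranks a new score against previously recorded scores for the same experiment; the function and message wording are invented for this example rather than taken from LabintheWild.

```python
import bisect

def percentile_feedback(score: float, prior_scores: list) -> str:
    """Rank a participant's score against earlier scores and phrase
    the result for a personalized results page."""
    if not prior_scores:
        return "You are one of our first participants -- check back soon!"
    ordered = sorted(prior_scores)
    # Number of earlier participants who scored strictly below this one.
    rank = bisect.bisect_left(ordered, score)
    percentile = 100.0 * rank / len(ordered)
    return f"You did better than {percentile:.0f}% of previous participants!"

# Example: a score of 0.82 against four earlier scores -> "better than 75%".
print(percentile_feedback(0.82, [0.50, 0.60, 0.75, 0.90]))
```

For measures where lower is better (e.g., reaction times), the same ranking can be applied to negated scores.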

Third, we design our experiments to take 5-15 minutes so that participation does not become too tedious or exhausting. This imposes constraints on the experiment design: Studies that require showing large numbers of stimuli are shortened by presenting each participant with a different randomized subsample of the full stimulus set, and we account for the resulting differences in sample frequencies in the analyses.
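The following Python sketch illustrates the subsampling idea under assumed numbers (a 200-stimulus pool, 30 stimuli per session); the per-stimulus presentation counts it collects are what allow an analysis to weight responses by inverse presentation frequency.

```python
import random
from collections import Counter

FULL_STIMULUS_SET = [f"stimulus_{i:03d}" for i in range(200)]  # hypothetical pool
SUBSAMPLE_SIZE = 30  # small enough to keep a session within the time budget

def stimuli_for_session(rng):
    """Draw a random subsample of the full stimulus set for one participant."""
    return rng.sample(FULL_STIMULUS_SET, SUBSAMPLE_SIZE)

# Simulate 1,000 sessions and count how often each stimulus was shown,
# so responses can later be weighted by inverse presentation frequency.
rng = random.Random(42)
shown = Counter()
for _ in range(1000):
    shown.update(stimuli_for_session(rng))
weights = {s: 1.0 / n for s, n in shown.items()}
```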

These design decisions have resulted in a continuous stream of traffic and more diverse participant samples than have been reported for laboratory experiments or those conducted on Mechanical Turk (see [7] for additional details).


Tutorial for Volunteer-Based Online Experiments
To enable other researchers to conduct studies with diverse participants, we developed an interactive tutorial for designing engaging and robust volunteer-based online experiments (see Figure 2). The tutorial builds on more than four years of experience with LabintheWild, several tens of thousands of comments from participants, and the results of various meta-studies conducted on the platform and its experiments. The web-based tutorial is implemented as a slide show that first provides a high-level overview of the general concept of volunteer-based online experiments (Figure 2.2) before delving into three case studies of successful LabintheWild experiments (Figure 2.3), including the research question that led to each experiment and the experiment's outcome. While researchers try the experiments through the eyes of participants, they can click to expand speech bubbles that explain specific design choices (see Figure 2.4). The tutorial ends with a summary of key design choices and a step-by-step guide to the requirements and process for launching an experiment on LabintheWild. During the demonstration at the conference, researchers will be able to explore LabintheWild and our interactive tutorial on a large touch-screen display, on several iPads, and on their own devices.


Figure 2: Our interactive online tutorial includes generalizable information about developing volunteer-based online experiments as well as three case studies of successful LabintheWild experiments.

Relevance and Contribution
Volunteer-based online experiments offer a promising, complementary research methodology to financially compensated online experiments, one that is currently underutilized in the research community. With this live demonstration of LabintheWild, its experiments, and our interactive tutorial, we aim to provide researchers with the knowledge and tools for designing and launching engaging and robust online experiments on LabintheWild or other volunteer-based experiment platforms.

REFERENCES
1. Leon Festinger. 1954. A theory of social comparison processes. Human Relations 7, 2 (1954), 117-140.

2. Laura Germine, Ken Nakayama, Bradley C. Duchaine, Christopher F. Chabris, Garga Chatterjee, and Jeremy B. Wilmer. 2012. Is the Web as good as the lab? Comparable performance from Web and lab in cognitive/perceptual experiments. Psychonomic Bulletin & Review 19, 5 (2012), 847-857.

3. Joseph K. Goodman, Cynthia E. Cryder, and Amar Cheema. 2013. Data collection in a flat world: The strengths and weaknesses of Mechanical Turk samples. Journal of Behavioral Decision Making 26, 3 (2013), 213-224.

4. Samuel D. Gosling, Simine Vazire, Sanjay Srivastava, and Oliver P. John. 2004. Should we trust web-based studies? A comparative analysis of six preconceptions about Internet questionnaires. American Psychologist 59, 2 (2004), 93-104.

5. Joshua K. Hartshorne and Laura T. Germine. 2015. When does cognitive functioning peak? The asynchronous rise and fall of different cognitive abilities across the life span. Psychological Science 26, 4 (2015), 433-443.

6. Winter Mason and Siddharth Suri. 2012. Conducting behavioral research on Amazon's Mechanical Turk. Behavior Research Methods 44, 1 (2012), 1-23.

7. Katharina Reinecke and Krzysztof Z. Gajos. 2015. LabintheWild: Conducting large-scale online experiments with uncompensated samples. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW '15). ACM, 1364-1378.

8. Aaron D. Shaw, John J. Horton, and Daniel L. Chen. 2011. Designing incentives for inexpert human raters. In Proceedings of the ACM 2011 Conference on Computer Supported Cooperative Work (CSCW '11). ACM, New York, NY, USA, 275-284.

9. Neil Stewart, Christoph Ungemach, Adam J. L. Harris, Daniel M. Bartels, Ben R. Newell, Gabriele Paolacci, and Jesse Chandler. 2015. The average laboratory samples a population of 7,300 Amazon Mechanical Turk workers. Judgment and Decision Making 10, 5 (2015), 479-491.