Mechanical Cheat

Mechanical Cheat: Spamming Schemes and Adversarial Techniques on Crowdsourcing Platforms
Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux
University of Fribourg, Switzerland

Transcript of Mechanical Cheat

Page 1: Mechanical Cheat

Mechanical Cheat: Spamming Schemes and Adversarial Techniques on Crowdsourcing Platforms

Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux
University of Fribourg, Switzerland

Page 2: Mechanical Cheat

Popularity and Monetary Incentives

Micro-task crowdsourcing is growing in popularity:
- ~500k registered workers on AMT
- ~200k HITs available (April 2012)
- ~$20k of rewards (April 2012)

Page 3: Mechanical Cheat

Spam could be a threat to crowdsourcing

Page 4: Mechanical Cheat

Some Experimental Results: Entity Link Selection (ZenCrowd, WWW 2012)

Evidence of participation by dishonest workers, who spend less time, complete more tasks, and achieve lower quality.

Page 5: Mechanical Cheat

Dishonest Answers on Crowdsourcing Platforms

We define a dishonest answer in a crowdsourcing context as an answer that has been either:
- Randomly posted
- Artificially generated
- Duplicated from another source

Page 6: Mechanical Cheat

How can requesters perform quality control?

Go over all the submissions?

Blindly accept all submissions?

Use selection and filtering algorithms.

Page 7: Mechanical Cheat

Anti-adversarial techniques

Pre-selection and dissuasion:
- Use built-in controls (e.g., acceptance rate)
- Task design
- Qualification tests

Post-processing:
- Task repetition and aggregation
- Test questions
- Machine learning (e.g., the probabilistic network in ZenCrowd)
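Task repetition and aggregation can be sketched as a simple majority vote over the repeated submissions for one task (a minimal illustration of the general idea, not the ZenCrowd implementation):

```python
from collections import Counter

def aggregate_majority(answers):
    """Aggregate the answers submitted for one repeated task
    by majority vote (the simplest aggregation scheme)."""
    label, _ = Counter(answers).most_common(1)[0]
    return label

# Three workers answered the same task; the majority label wins.
print(aggregate_majority(["A", "A", "B"]))  # prints A
```

As the group-attack slide below notes, this naive scheme is exactly what colluding workers target, which motivates combining it with other filters.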

Page 8: Mechanical Cheat

Countering adversarial techniques: Organization

Page 9: Mechanical Cheat

Countering adversarial techniques: Individual attacks

Random Answers
- Target tasks designed with a monetary incentive
- Countered with test questions

Automated Answers
- Target tasks with a simple submission mechanism
- Countered with test questions (especially CAPTCHAs)

Semi-Automated Answers
- Target easy HITs achievable with some AI
- Can pass easy-to-answer test questions
- Can detect CAPTCHAs and forward them to a human
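The test-question counter can be sketched as a gold-standard check: inject tasks with known expert answers and reject workers whose accuracy on them falls below a threshold (a minimal sketch; the 0.75 threshold and the function name are illustrative assumptions, not from the talk):

```python
def passes_gold(answers, gold, min_accuracy=0.75):
    """Return True if a worker's accuracy on the injected test
    questions reaches the threshold (0.75 is illustrative).

    answers: {task_id: answer} submitted by one worker.
    gold:    {task_id: expert answer} for the hidden test tasks.
    """
    checked = [t for t in gold if t in answers]
    if not checked:
        return False  # no evidence; treat the worker as suspicious
    correct = sum(answers[t] == gold[t] for t in checked)
    return correct / len(checked) >= min_accuracy

# A worker answering test questions at random tends to fail the check.
print(passes_gold({"t1": "A", "t2": "B"}, {"t1": "A", "t2": "C"}))  # prints False
```

As the slide points out, this only stops random and fully automated answers; semi-automated attackers can still pass easy gold questions.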

Page 10: Mechanical Cheat

Countering adversarial techniques: Group attacks

Agree on Answers
- Target naïve aggregation schemes like majority vote
- May discard valid answers!
- Countered by shuffling the options

Answer Sharing
- Target repeated tasks
- Countered by creating multiple batches

Artificial Clones
- Target repeated tasks
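The option-shuffling counter can be sketched as follows: each task presents the same options in its own random order, so colluders who agree on a position (e.g. "always pick the first option") no longer produce the same label. This is a minimal sketch; the function name and the batch_seed parameter are illustrative assumptions:

```python
import random

def shuffled_options(task_id, options, batch_seed=42):
    """Return the answer options in a per-task random order.
    batch_seed is an illustrative parameter that keeps the
    order reproducible within one batch, so a worker's clicked
    position can be mapped back to the underlying option."""
    rng = random.Random(f"{batch_seed}:{task_id}")
    order = list(options)
    rng.shuffle(order)
    return order

# Each task shows its own ordering; the server maps the clicked
# position back to the actual label before aggregation.
order = shuffled_options("task-7", ["cat", "dog", "bird"])
picked_position = 0              # what the worker clicked
answer = order[picked_position]  # the label actually recorded
```

Seeding per task (rather than per worker) means all workers on one task see the same order, while "always pick position 0" spreads colluders across different labels over a batch.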

Page 11: Mechanical Cheat

Conclusions and future work

We argue that some quality-control tools are inefficient at countering resourceful spammers.

Combine multiple techniques for post-filtering.

Crowdsourcing platforms should provide more tools.

The evaluation of future filtering algorithms must be repeatable and generic: a crowdsourcing benchmark.

Page 12: Mechanical Cheat

Conclusions and future work: Benchmark proposal

A collection of tasks with multiple choice options

Each task is repeated multiple times

Unpublished expert judgment for all the tasks

Publish answers collected in a controlled environment with the following categories of workers:
- Honest workers
- Random clicks
- Semi-automated programs
- Organized groups

Post-filtering methods are evaluated based on their ability to achieve a high precision score. Other parameters could include the money spent, etc.
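The precision score the benchmark would report can be computed against the unpublished expert judgments (a minimal sketch of the proposed metric; the function name is illustrative):

```python
def precision(accepted, gold):
    """Precision of a post-filtering method: the fraction of the
    answers it accepted that match the expert judgment.

    accepted: {task_id: answer} the method decided to keep.
    gold:     {task_id: expert answer} for all tasks.
    """
    if not accepted:
        return 0.0
    correct = sum(gold.get(t) == a for t, a in accepted.items())
    return correct / len(accepted)

# A method that accepted two answers, one of them correct:
print(precision({"t1": "A", "t2": "B"}, {"t1": "A", "t2": "C"}))  # prints 0.5
```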

Page 13: Mechanical Cheat

Discussion: Q&A