Mechanical Cheat

Transcript of Mechanical Cheat

1. Mechanical Cheat: Spamming Schemes and Adversarial Techniques on Crowdsourcing Platforms
   Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux
   University of Fribourg, Switzerland

2. Popularity and Monetary Incentives
   • Micro-task crowdsourcing is growing in popularity:
   • ~500k registered workers on AMT
   • ~200k HITs available (April 2012)
   • ~$20k of rewards (April 2012)

3. Spam Could Be a Threat to Crowdsourcing

4. Some Experimental Results: Entity Link Selection (ZenCrowd, WWW 2012)
   • Evidence of participation by dishonest workers, who spend less time, complete more tasks, and achieve lower quality.

5. Dishonest Answers on Crowdsourcing Platforms
   • We define a dishonest answer in a crowdsourcing context as an answer that has been either:
   • randomly posted,
   • artificially generated, or
   • duplicated from another source.

6. How Can Requesters Perform Quality Control?
   • Go over all the submissions?
   • Blindly accept all submissions?
   • Use selection and filtering algorithms.

7. Anti-Adversarial Techniques
   • Pre-selection and dissuasion: use built-in controls (e.g., acceptance rate), task design, qualification tests.
   • Post-processing: task repetition and aggregation, test questions, machine learning (e.g., the probabilistic network in ZenCrowd). A sketch combining repetition with test-question filtering follows the transcript.

8. Countering Adversarial Techniques: Organization

9. Countering Adversarial Techniques: Individual Attacks
   • Random answers: target tasks designed with monetary incentives; countered with test questions.
   • Automated answers: target tasks with simple submission mechanisms; countered with test questions (especially CAPTCHAs).
   • Semi-automated answers: target easy HITs achievable with some AI; can pass easy-to-answer test questions; can detect CAPTCHAs and forward them to a human.

10. Countering Adversarial Techniques: Group Attacks
   • Agree on answers: targets naïve aggregation schemes like majority vote, which may then discard valid answers! Countered by shuffling the options (see the option-shuffling sketch after the transcript).
   • Answer sharing: targets repeated tasks; countered by creating multiple batches.
   • Artificial clones: target repeated tasks.

11. Conclusions and Future Work
   • We claim that some quality-control tools are inefficient against resourceful spammers.
   • Combine multiple techniques for post-filtering.
   • Crowdsourcing platforms should provide more tools.
   • Evaluation of future filtering algorithms must be repeatable and generic: a crowdsourcing benchmark.

12. Conclusions and Future Work: Benchmark Proposal
   • A collection of tasks with multiple-choice options.
   • Each task is repeated multiple times.
   • Unpublished expert judgments for all the tasks.
   • Published answers completed in a controlled environment with the following categories of workers: honest workers, random clicks, a semi-automated program, an organized group.
   • Post-filtering methods are evaluated on their ability to achieve a high precision score (see the scoring sketch after the transcript); other parameters could include the money spent, etc.

13. Discussion: Q&A
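Slide 7 lists task repetition, aggregation, and test questions as post-processing controls. A minimal sketch of how those could be combined, assuming answers arrive as (worker, task, answer) tuples and that a few tasks have known gold answers; the data shapes, function name, and accuracy threshold are illustrative assumptions, not the authors' implementation:

```python
from collections import Counter, defaultdict

def filter_and_aggregate(answers, gold, min_accuracy=0.75):
    """answers: list of (worker_id, task_id, answer) tuples.
    gold: dict mapping each test task_id to its known correct answer.
    min_accuracy: hypothetical cutoff for trusting a worker."""
    # Score each worker on the test (gold) questions they answered.
    seen, correct = Counter(), Counter()
    for worker, task, answer in answers:
        if task in gold:
            seen[worker] += 1
            correct[worker] += int(answer == gold[task])
    # Keep only workers who pass the test-question threshold.
    trusted = {w for w in seen if correct[w] / seen[w] >= min_accuracy}

    # Aggregate the remaining (non-test) answers by majority vote.
    votes = defaultdict(Counter)
    for worker, task, answer in answers:
        if worker in trusted and task not in gold:
            votes[task][answer] += 1
    return {task: c.most_common(1)[0][0] for task, c in votes.items()}
```

Note that workers who answered no test questions are simply excluded here; a real deployment would need a policy for them.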
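Slide 10 proposes shuffling the options to counter groups who collude on an answer position. A small sketch of that idea, assuming each assignment stores its own permutation so the displayed index can be mapped back to the canonical option; all names here are hypothetical:

```python
import random

def shuffled_assignment(task_id, options, rng=random):
    """Create one assignment whose options appear in a fresh random
    order, so colluders who agree on "always pick option C" end up
    selecting different underlying answers."""
    order = list(range(len(options)))
    rng.shuffle(order)
    return {
        "task": task_id,
        "shown_options": [options[i] for i in order],
        "order": order,  # kept so displayed picks map back to canon
    }

def canonical_answer(assignment, picked_index):
    """Map the index the worker clicked back to the canonical option."""
    return assignment["order"][picked_index]
```

Because each worker sees a different ordering, identical displayed picks from colluders translate into scattered canonical answers, which majority voting can then discount.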
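Slide 12 scores post-filtering methods by their precision against the unpublished expert judgments. A hedged sketch of that scoring, assuming both the method's output and the expert judgments are dictionaries keyed by task id; the exact benchmark protocol is not specified in the slides:

```python
def precision(predicted, expert):
    """predicted: dict task_id -> answer produced after post-filtering.
    expert: dict task_id -> expert judgment (kept unpublished in the
    proposed benchmark). Returns precision over the tasks answered."""
    answered = [t for t in predicted if t in expert]
    if not answered:
        return 0.0
    hits = sum(predicted[t] == expert[t] for t in answered)
    return hits / len(answered)
```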