ALFRED demo - www2013

  • date post

    21-May-2015
  • Category

    Education

description

ALFRED: Crowd Assisted Data Extraction

Transcript of ALFRED demo - www2013

  • 1. ALFRED: Crowd Assisted Data Extraction

    Valter Crescenzi, Paolo Merialdo, Disheng Qiu
    Dipartimento di Ingegneria, Università degli Studi Roma Tre
    Via della Vasca Navale, 79, Rome
    disheng@dia.uniroma3.it

  • 2-6. Extracting data

    2M pages from IMDB, and we want to extract ... titles, directors etc ....

    Diagram labels: inference algorithm, wrapper, DB

  • 7-10. Scaling Wrapper Inference

    Scaling the number of workers with crowdsourcing platforms opens new challenges:

    Issue: Non-expert workers
    Contributions: Simple interactions to reduce the worker error rate; Membership Query (yes/no answer)

    Issue: Costs
    Contributions: Active Learning to carefully select queries

    Issue: Quality
    Contributions: Bayesian Model to evaluate the expected wrapper quality; Sampling algorithms; Tolerant to inaccurate workers

  • 11. Architecture

    ALFRED is a wrapper inference system supervised by workers from a crowdsourcing platform.

    *Research Track: A Framework for Learning Web Wrappers from the Crowd, WWW 2013

  • 12. Input and Rules Generation

  • 13. Sample Set and Extracted Values

  • 14-15. Sample Set and Extracted Values

          page0       page1         page2
    r1    Inception   City of God   Oblivion
    r2    Inception   City of God   null
    r3    Inception   null          Oblivion

  • 16. Probability and Noisy
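
The slides name membership queries (yes/no answers), active selection of queries, and a Bayesian model that tolerates inaccurate workers, but they do not show how these fit together. The following is a minimal sketch of one way they could combine, using the rules r1, r2, r3 and the values they extract on page0, page1, page2 from the sample-set slide. It is not ALFRED's actual algorithm: the function names, the uniform prior, the "closest to 0.5" selection heuristic, and the single fixed worker error rate are all assumptions made for illustration.

```python
# Hypothetical illustration only; not taken from the ALFRED system.
# Values extracted by each candidate rule on page0, page1, page2
# (None stands for the "null" entries on the sample-set slide).
EXTRACTED = {
    "r1": ["Inception", "City of God", "Oblivion"],
    "r2": ["Inception", "City of God", None],
    "r3": ["Inception", None, "Oblivion"],
}


def candidate_queries(extracted):
    """Return every distinct (page index, value) pair that some rule extracts."""
    queries = set()
    for values in extracted.values():
        for page, value in enumerate(values):
            if value is not None:
                queries.add((page, value))
    return sorted(queries)


def yes_mass(prior, extracted, query):
    """Probability mass of the rules that would answer 'yes' to the query."""
    page, value = query
    return sum(p for rule, p in prior.items() if extracted[rule][page] == value)


def pick_query(prior, extracted):
    """Assumed heuristic: ask the question the rules disagree on most,
    i.e. the one whose 'yes' mass is closest to 0.5."""
    return min(
        candidate_queries(extracted),
        key=lambda q: abs(yes_mass(prior, extracted, q) - 0.5),
    )


def update(prior, extracted, query, answer, error_rate):
    """Bayes update of the rule probabilities after one yes/no answer,
    assuming the worker gives the wrong answer with probability error_rate."""
    page, value = query
    posterior = {}
    for rule, p in prior.items():
        predicts_yes = extracted[rule][page] == value
        likelihood = 1 - error_rate if predicts_yes == answer else error_rate
        posterior[rule] = p * likelihood
    total = sum(posterior.values())
    return {rule: p / total for rule, p in posterior.items()}


if __name__ == "__main__":
    prior = {rule: 1 / len(EXTRACTED) for rule in EXTRACTED}  # assumed uniform prior
    query = pick_query(prior, EXTRACTED)
    posterior = update(prior, EXTRACTED, query, answer=True, error_rate=0.1)
    print(query, posterior)
```

Under these assumptions, asking about "Inception" on page0 is uninformative because all three rules extract it, so the sketch prefers a page where the rules disagree. A single "yes" answer lowers the probability of the rule that returns null on that page without eliminating it, which is how a wrong answer from one worker can still be outweighed by later answers.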