Generating Exact- and Ranked Partially-Matched Answers to Questions in Advertisements

R. Qumsiyeh, M.S. Pera, and Y.-K. Ng

Introduction
Searching tools on existing ads websites:
  Form-based interface
  Simple keyword search
Problems encountered: these tools cannot handle natural language questions with rich syntactic and semantic content, e.g.,
  "Ion cars, cheaper than 6000 dollars, located in New York. If possible I want one with a sunroof" (a Facebook user question)

Current Search Tools

A Proposed Solution
A closed-domain Question Answering (QA) system on ads (CQAds), our QA system
Problems not addressed by existing ads searching tools that CQAds handles:

  Natural language questions
  Ambiguous/incomplete ads questions
  Shorthand notations/spelling mistakes in questions
  Implicit/explicit Boolean operators in questions (e.g., "Blue AND Yellow")

  Ranking answers that partially match the user's information needs in questions

CQAds processes a user's question Q by identifying Q's ad domain using a Naive Bayes classifier.
Example. Consider Q: "Metallic blue BMW 721 with alloy wheels and a manual gear" (a Facebook car ad question)
Classifying Ads Domains
[Equation not captured in the transcript; its labels: the set of ads domains, an ad domain d, and the probability of d being the domain of Q (based on the Joint Beta-Binomial Sampling Model)]

CQAds eliminates spelling mistakes using a trie data structure.
Example. "Honda accorr less than $2,000" -> "Honda Accord less than $2,000"
Example. "Hondaacord less than $2,000" -> "Honda Accord less than $2,000"
CQAds handles shorthand notations using a simple script.
Example. "Cheap 4 dr Lexus, not above 75,000 mi" -> "Cheap 4 doors Lexus, not above 75,000 miles"
Spelling Errors/Shorthand Notations

CQAds interprets a user's information need specified in Q by tagging keywords in Q with their types (using a trie).
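The domain-classification step above can be illustrated with a small Naive Bayes sketch. Note the substitution: the slides base the probability on the Joint Beta-Binomial Sampling Model, whose formula is not in this transcript, so the sketch uses a plain multinomial Naive Bayes with add-one smoothing instead. The training questions, the domain names, and the function names are all made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training questions per ad domain (illustrative only).
TRAINING = {
    "cars": ["blue honda accord manual gear", "bmw with alloy wheels", "cheap toyota 4 doors"],
    "housing": ["two bedroom apartment downtown", "condo with garage", "house with large yard"],
    "jobs": ["part time software engineer", "nurse position full time", "remote accounting job"],
}

def train(training):
    """Count word frequencies per domain and collect the vocabulary."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for domain, questions in training.items():
        for q in questions:
            words = q.lower().split()
            word_counts[domain].update(words)
            vocab.update(words)
            doc_counts[domain] += 1
    return word_counts, doc_counts, vocab

def classify(question, model):
    """Return the domain d maximizing log P(d) + sum of log P(w | d)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_domain, best_score = None, -math.inf
    for domain in doc_counts:
        score = math.log(doc_counts[domain] / total_docs)
        total_words = sum(word_counts[domain].values())
        for w in question.lower().split():
            # Add-one (Laplace) smoothing: unseen words do not zero out a domain.
            score += math.log((word_counts[domain][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_domain, best_score = domain, score
    return best_domain

model = train(TRAINING)
print(classify("Metallic blue BMW 721 with alloy wheels and a manual gear", model))  # cars
```

With this toy data the slide's example question lands in the "cars" domain because seven of its words occur in the car training questions and only one ("with") occurs elsewhere.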
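Both the spelling correction and the keyword tagging described above are said to use a trie. The following is a minimal sketch of how one trie could serve both purposes, assuming a tolerance of one edit (Levenshtein distance) and a split-in-two fallback for run-together words. The keyword vocabulary, the type assigned to each keyword, the shorthand table, and all function names are hypothetical; the slides do not specify how CQAds implements these steps.

```python
END = ""  # end-of-word marker; never collides with a one-character key

# Hypothetical shorthand table and keyword/type vocabulary for car ads.
SHORTHAND = {"dr": "doors", "mi": "miles"}
KEYWORDS = {
    "honda": "Type I", "accord": "Type I", "lexus": "Type I",
    "red": "Type II", "doors": "Type II",
    "miles": "Type III",
    "above": "Boundary", "under": "Boundary",
    "cheapest": "Superlative",
}

class Trie:
    def __init__(self, keywords):
        self.root = {}
        for word, tag in keywords.items():
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[END] = tag  # store the keyword type at the end of the word

    def lookup(self, word, max_edits=1):
        """Return (keyword, type) for the closest stored word within max_edits, else None."""
        results = []
        first_row = list(range(len(word) + 1))
        for ch, child in self.root.items():
            self._walk(child, ch, ch, word, first_row, max_edits, results)
        if not results:
            return None
        _, keyword, tag = min(results, key=lambda r: r[0])
        return keyword, tag

    def _walk(self, node, ch, prefix, word, prev_row, max_edits, results):
        # Extend the Levenshtein table by one row for the trie character ch.
        row = [prev_row[0] + 1]
        for i in range(1, len(word) + 1):
            cost = 0 if word[i - 1] == ch else 1
            row.append(min(row[i - 1] + 1, prev_row[i] + 1, prev_row[i - 1] + cost))
        if END in node and row[-1] <= max_edits:
            results.append((row[-1], prefix, node[END]))
        if min(row) <= max_edits:  # prune branches that can no longer match
            for nxt, child in node.items():
                if nxt != END:
                    self._walk(child, nxt, prefix + nxt, word, row, max_edits, results)

def correct(token, trie):
    """Return the (keyword, type) pairs recognized in one token, or [] if unknown."""
    match = trie.lookup(token)
    if match:
        return [match]
    # Fallback for run-together words such as "hondaacord": try every split point.
    for i in range(2, len(token) - 1):
        left, right = trie.lookup(token[:i]), trie.lookup(token[i:])
        if left and right:
            return [left, right]
    return []

def tag_question(question, trie):
    """Expand shorthand, correct spelling, and tag each token with its type (None if unknown)."""
    tagged = []
    for raw in question.lower().split():
        token = raw.strip(",.?")
        token = SHORTHAND.get(token, token)
        tagged.extend(correct(token, trie) or [(token, None)])
    return tagged

trie = Trie(KEYWORDS)
print(tag_question("Cheap 4 dr Lexus, not above 75,000 mi", trie))
```

A tolerance of one edit is a deliberate simplification: it keeps short function words such as "less" or "than" from being "corrected" into keywords, at the cost of missing heavier misspellings.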

CQAds also applies context switching to identify/merge selection criteria using proximity keywords.
Identifying Users' Info. Needs
  "Red Ferrari under $10,000"
  "Cheapest Avenger that has less than 25,000 miles"
Keyword types:
  Type I: unique identifier of the item showcased in an ad A
  Type II: descriptive property of the item showcased in A
  Type III: quantitative property of the item showcased in A
  Boundary: range values
  Superlative: max/min values

CQAds processes any incomplete/ambiguous question Q based on the valid ranges of the attributes in Q.
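One plausible reading of the statement above is that a number given without a unit or attribute name is matched against every quantitative attribute whose valid range contains it. The sketch below assumes that reading. The range limits are invented for illustration (the slides do not list them) and are chosen so that the two example questions from the slides, "below 2000" and "less than 7500", behave as the slides report.

```python
# Hypothetical valid ranges for the quantitative (Type III) attributes of car ads.
VALID_RANGES = {
    "Year": (1900, 2025),
    "Price": (100, 500_000),
    "Mileage": (0, 300_000),
}

def candidate_attributes(value, ranges=VALID_RANGES):
    """Return every attribute whose valid range contains the unlabeled value."""
    return [name for name, (low, high) in ranges.items() if low <= value <= high]

print(candidate_attributes(2000))  # ['Year', 'Price', 'Mileage']
print(candidate_attributes(7500))  # ['Price', 'Mileage']
```

The value 7500 drops Year because no model year can be that large, which is why the Ford F-150 question is checked against Price and Mileage only.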

Handling Incomplete Questions
  "H2 Hummer, yellow, 24 inch wheels, 4 wheel drive below 2000": within the pre-defined ranges of Year, Price, and Mileage
  "Ford F-150, 4 door less than 7500": within the pre-defined ranges of Price and Mileage

CQAds evaluation steps:
  1. Considers the primary-indexed field in a relation schema
  2. Evaluates secondary-indexed fields
  3. Analyzes boundaries on Type III attribute values
  4. Evaluates superlatives
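The four evaluation steps above can be sketched as successive filters, each applied to the ads kept by the previous one. This is an illustration only: the ads, the attribute names, and the `evaluate` function are hypothetical, plain list filtering stands in for the indexed database lookups, and treating the make as the Type I attribute and the color as a Type II attribute is an assumption.

```python
import operator

# Hypothetical ads table; a real system would query indexed database fields.
ADS = [
    {"id": 1, "make": "Honda", "color": "red", "mileage": 1500, "price": 9000},
    {"id": 2, "make": "Honda", "color": "red", "mileage": 1200, "price": 8500},
    {"id": 3, "make": "Honda", "color": "blue", "mileage": 900, "price": 7000},
    {"id": 4, "make": "Honda", "color": "red", "mileage": 30000, "price": 4000},
    {"id": 5, "make": "Toyota", "color": "red", "mileage": 1000, "price": 6000},
]

def evaluate(ads, type1, type2, boundaries, superlative=None):
    """Apply the four steps in order, each to the ads kept by the previous one."""
    # Step 1: Type I attribute values (primary-indexed field).
    kept = [ad for ad in ads if all(ad[a] == v for a, v in type1.items())]
    # Step 2: Type II attribute values (secondary-indexed fields).
    kept = [ad for ad in kept if all(ad[a] == v for a, v in type2.items())]
    # Step 3: boundaries on Type III attribute values, given as (attribute, comparison, value).
    kept = [ad for ad in kept if all(op(ad[a], v) for a, op, v in boundaries)]
    # Step 4: superlative, given as (attribute, min or max).
    if superlative and kept:
        attr, pick = superlative
        best = pick(ad[attr] for ad in kept)
        kept = [ad for ad in kept if ad[attr] == best]
    return kept

# "Cheapest red Honda less than 2,000 miles"
answers = evaluate(
    ADS,
    type1={"make": "Honda"},
    type2={"color": "red"},
    boundaries=[("mileage", operator.lt, 2000)],
    superlative=("price", min),
)
print([ad["id"] for ad in answers])  # [2]
```

The order matters: applying the superlative first would pick ad 4 (the cheapest red Honda overall), which then fails the mileage boundary and leaves no answer.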

Processing Non-Boolean Questions
Example. "Cheapest red Honda less than 2,000 miles"
  1. Type I attribute (a primary-indexed field)
  2. Type II attribute (applied to ads retrieved in Step 1)
  3. Boundaries on Type III (applied to ads obtained in Step 2)
  4. Superlatives (applied to ads identified in Step 3)

The CQAds evaluation process:
  Complements the negated quantifiers
  Combines quantifiers on the same attribute
  Handles mutually exclusive attributes
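The three operations listed above can be sketched on a simple constraint representation. Everything here is an assumption made for illustration: constraints are (attribute, operator, value) tuples with "<" and ">" as the only operators, the function names are invented, and the implicit AND between different attributes is written out explicitly. The complement follows the slides in mapping "not less than" to "more than", although a strict logical complement would be "at least".

```python
from collections import defaultdict

# Follows the slides: "not less than" becomes "more than", and vice versa.
FLIP = {"<": ">", ">": "<"}

def complement(op, negated):
    """Replace a negated quantifier by its complement."""
    return FLIP[op] if negated else op

def combine(constraints):
    """Merge '>' and '<' constraints on the same attribute into one (low, high) range."""
    bounds = defaultdict(lambda: [None, None])
    for attr, op, value in constraints:
        low, high = bounds[attr]
        if op == ">":
            bounds[attr][0] = value if low is None else max(low, value)
        else:
            bounds[attr][1] = value if high is None else min(high, value)
    return {attr: tuple(b) for attr, b in bounds.items()}

def group_exclusive(values, attribute_of):
    """OR values of the same attribute (an ad cannot have both); AND across attributes."""
    groups = defaultdict(list)
    for value in values:
        groups[attribute_of[value]].append(value)
    parts = ["(" + " OR ".join(vs) + ")" if len(vs) > 1 else vs[0] for vs in groups.values()]
    return " AND ".join(parts)

print(complement("<", negated=True))                             # >
print(combine([("price", ">", 2000), ("price", "<", 8000)]))     # {'price': (2000, 8000)}
print(group_exclusive(
    ["Toyota", "Honda", "2 door"],
    {"Toyota": "make", "Honda": "make", "2 door": "doors"},
))                                                               # (Toyota OR Honda) AND 2 door
```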

Processing Implicit Boolean Questions
  "not less than $1500" -> "more than $1500"
  "more than $2000, less than $8000" -> "between $2000 and $8000"
  "Toyota, Honda 2 door" -> "(Toyota OR Honda) 2 door"

CQAds processes a question Q as is if Q consists of sequences of attribute values separated only by ANDs (or only by ORs), e.g., "blue OR red OR green car". Otherwise, CQAds
  Excludes all the Boolean operators from Q
  Evaluates Q using inferred Boolean operators
Example. "Find Toyota and Honda cars with 2 doors" -> "(Toyota OR Honda) with 2 doors"
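The decision described above can be sketched as follows. The slides do not say how the operators are inferred, so the sketch assumes the mutually-exclusive-attribute rule from the implicit case: values of the same attribute are joined by OR, and different attributes by AND. The vocabulary, the attribute names, and the functions `parse` and `interpret` are hypothetical.

```python
import re

# Hypothetical vocabulary mapping attribute values to their attributes.
ATTRIBUTE_OF = {
    "toyota": "make", "honda": "make",
    "blue": "color", "red": "color", "green": "color",
    "2 doors": "doors",
}

def parse(question):
    """Return the known attribute values in Q and the text between neighbouring values."""
    text = question.lower()
    alternatives = "|".join(re.escape(v) for v in sorted(ATTRIBUTE_OF, key=len, reverse=True))
    matches = list(re.finditer(r"\b(?:" + alternatives + r")\b", text))
    values = [m.group() for m in matches]
    gaps = [text[a.end():b.start()].strip() for a, b in zip(matches, matches[1:])]
    return values, gaps

def interpret(question):
    values, gaps = parse(question)
    # Processed as is: the values are separated only by ANDs, or only by ORs.
    if gaps and (all(g == "and" for g in gaps) or all(g == "or" for g in gaps)):
        return (" " + gaps[0].upper() + " ").join(values)
    # Otherwise ignore the stated operators and infer them (assumed rule, see above).
    groups = {}
    for value in values:
        groups.setdefault(ATTRIBUTE_OF[value], []).append(value)
    parts = ["(" + " OR ".join(vs) + ")" if len(vs) > 1 else vs[0] for vs in groups.values()]
    return " AND ".join(parts)

print(interpret("blue OR red OR green car"))                 # blue OR red OR green
print(interpret("Find Toyota and Honda cars with 2 doors"))  # (toyota OR honda) AND 2 doors
```

In the second question the stated "and" is dropped because the values are not all separated by the same operator: nothing but "cars with" stands between "Honda" and "2 doors".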

Processing Explicit Boolean Questions

CQAds performs exact-match on N - 1 selection criteria in Q and calculates the (normalized) degree of similarity of the remaining condition in Q (based on its attribute type) against the ads in the DB.
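A minimal sketch of the N - 1 scheme above: keep the ads that miss exactly one criterion and rank them by a similarity score on that criterion. The scoring is a stand-in, not the similarity formula used by CQAds, which this transcript does not give: numeric values use a relative-difference score, and categorical values use a lookup table whose entries are invented here in place of the word-correlation factors and domain-specific matrices mentioned in the conclusions. The ads and all names are hypothetical.

```python
# Invented similarity values for categorical attributes (illustration only).
CATEGORICAL_SIM = {
    frozenset(["white", "blue"]): 0.4,
    frozenset(["Honda Accord", "Toyota Camry"]): 0.8,
}

def similarity(wanted, actual):
    """Normalized similarity in [0, 1], chosen by the type of the value."""
    if isinstance(wanted, (int, float)):
        # Assumes positive numeric values such as prices.
        return 1 - abs(wanted - actual) / max(wanted, actual)
    return CATEGORICAL_SIM.get(frozenset([wanted, actual]), 0.0)

def partial_matches(criteria, ads):
    """Rank ads that match exactly N - 1 criteria by similarity on the remaining one."""
    ranked = []
    for ad in ads:
        missed = [attr for attr, value in criteria.items() if ad[attr] != value]
        if len(missed) == 1:
            attr = missed[0]
            ranked.append((similarity(criteria[attr], ad[attr]), ad))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked

# "$2000 white Honda Accord"
criteria = {"price": 2000, "color": "white", "model": "Honda Accord"}
ads = [
    {"id": "A", "price": 1500, "color": "white", "model": "Honda Accord"},
    {"id": "B", "price": 2000, "color": "blue", "model": "Honda Accord"},
    {"id": "C", "price": 2000, "color": "white", "model": "Toyota Camry"},
    {"id": "D", "price": 1500, "color": "blue", "model": "Toyota Camry"},
]
for score, ad in partial_matches(criteria, ads):
    print(ad["id"], score)
```

Ad D is excluded because it misses all three criteria; the order of the other three depends entirely on the invented similarity values.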

Performing Partial Matching
[Figure: the question "$2000 white Honda Accord" compared with ads that differ in one criterion: "$1500"?, "blue"?, "Toyota Camry"?]

Experimental Results: Dataset
  Survey data
  Benchmark dataset

Survey: users' assessments through Facebook to evaluate CQAds

Ads Sources

650 ads questions on 8 different ads domains that cover users' basic needs
Performance Evaluation

Experimental Results: Ads Domain Classification
Classification accuracy of CQAds on assigning 650 ads questions to their corresponding domains

Experimental Results: Boolean Questions Interpretation
Accuracy of CQAds on interpreting the information needs in explicit/implicit Boolean questions
Based on 10 randomly-chosen sample questions (on diverse domains) and 182 responses offered by Facebook users
Sample questions:
  "Show me Black Silver cars"
  "Black Mustang with gps, exclude 2 wheeldrive, or a yellow corvette without a gps"

Experimental Results: Exact-Matched Answers
Effectiveness of CQAds on retrieving answers exactly matching users' specifications in 650 ads questions
Determined by Facebook users

Experimental Results: Partial Matching & Ranking
Precision@K and Mean Reciprocal Rank (MRR) scores achieved by CQAds and other ranking approaches
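For reference, the two metrics named above have standard definitions, sketched here. Precision@K is the fraction of the top K answers judged relevant, and MRR averages, over all questions, the reciprocal of the rank of the first relevant answer. The relevance judgments below are invented to show the arithmetic and are not results from the study.

```python
def precision_at_k(relevance, k):
    """Fraction of the top-k answers judged relevant (relevance is a list of 0/1 flags)."""
    return sum(relevance[:k]) / k

def mean_reciprocal_rank(all_relevance):
    """Average over questions of 1/rank of the first relevant answer (0 if there is none)."""
    total = 0.0
    for relevance in all_relevance:
        for rank, relevant in enumerate(relevance, start=1):
            if relevant:
                total += 1 / rank
                break
    return total / len(all_relevance)

# Invented judgments for the top-5 answers to three questions (1 = relevant).
judgments = [
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print([precision_at_k(r, 5) for r in judgments])  # [0.6, 0.2, 0.0]
print(mean_reciprocal_rank(judgments))            # 0.5
```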

The scores were determined by 866 responses provided by Facebook users on the top-5 partially-matched answers to each of the 40 sample (non-)Boolean questions.

Experimental Results: Question Processing Time
Efficiency of CQAds and other ranking approaches
Based on the average time required to generate answers to each of the 650 ads questions gathered through Facebook

Conclusions
CQAds

Processes natural language questions on ads

Handles incomplete/ambiguous ad questions

Corrects spelling mistakes & detects shorthand notations
Objective
Uniqueness

Handles (explicit/implicit) Boolean questions (e.g., "Condo, Apt")

Determines the evaluation order of selection criteria in questions using an elegant approach
Retrieves exact/partial-matching answers using word-correlation factors, domain-specific matrices, and a novel similarity formula
[Figure: the question "Blue Toyota?" with the candidate answers "Red Ferrari", "Blue Honda", and "Blue Toyota"]

Uniqueness

Outperforms existing ranking approaches

CQAds is more powerful than the search tools of existing ads websites
Merit
Validation

Highly effective in answering (non-)Boolean ads natural language questions

Questions

Related Work
Closed-domain QA systems rely on
  Ontologies [Chung 04, Wang 10, Vargas-Vera 10]
  Pre-defined taxonomies & natural language processing [Terol 07]
  Semantically well-formed sentences available online [Wang 09]
Ranking approaches are often based on
  Scoring functions [Manning 08]
  User-feedback measures [Bilotti 07, Kiebling 02]
  Existing Frequently Asked Questions (FAQ) [Burke 97]
