Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Presented by Paul Nelson


October 13-16, 2016 • Austin, TX

Search Accuracy Metrics & Predictive Analytics A Big Data Use Case

Paul Nelson Chief Architect, Search Technologies

pnelson@searchtechnologies.com

3

There will be a demo (so don’t go away)

4

185+ Consultants Worldwide

San Diego • London, UK • San Jose, CR • Cincinnati • Prague, CZ • Washington (HQ) • Frankfurt, DE

• Founded 2005
• Deep search expertise
• 700+ customers worldwide
• Consistent profitability
• Search engines & Big Data
• Vendor independent

5

Typical Conversation with Customer

“Our search accuracy is bad.”

“How bad?”

“Really, really bad.”

“Uh… on a scale of 1 to 10, how bad?”

“An eight. No wait… a nine. Maybe even a 9.5. Let’s call it a 9.23.”

6

Current methods are woefully inadequate

•  Golden query set
   o  Key documents
•  Top 100 / Top 1000 queries analysis
•  Zero-result queries
•  Abandonment rate
•  Queries with clicks
•  Conversion

7

What are we trying to achieve?

•  Reliable metrics for search accuracy
•  Can run analysis off-line
   o  Does not require a production deployment (!)
•  Can accurately compare two engines
•  Runs quickly = agility = high quality
•  Can handle different user types / personalization
   o  Broad coverage
•  Provides lots of data to analyze what’s going on
   o  Data to decide how best to improve the engine

[Diagram: the search engine under evaluation]

8

Leverage logs for accuracy testing

[Diagram: query logs and click logs feed a Big Data framework, which runs against the search engine under evaluation and outputs engine score(s), other metrics & histograms, and a scoring database.]

9

From Queries → Users

•  User-by-user metrics
   o  Change in focus
•  Group activity by session and/or user
   o  Call this an “Activity Set”
   o  Merge sessions and users
•  Use Big Data to analyze all users
   o  There are no stupid queries and no stupid users
   o  Overall performance is based on the experience of the users

[Diagram: queries, clicks, and other activity cluster around each user]
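An “Activity Set” of this kind can be sketched in a few lines of Python. The event shape below (dicts with `user`, `type` fields) is a hypothetical log format, not one from the talk:

```python
from collections import defaultdict

def build_activity_sets(events):
    """Group raw log events (queries, clicks, other activity) into one
    "activity set" per user, merging that user's sessions together."""
    activity_sets = defaultdict(list)
    for event in events:
        activity_sets[event["user"]].append(event)
    return dict(activity_sets)

events = [
    {"user": "u1", "type": "query", "q": "laptop"},
    {"user": "u1", "type": "click", "doc": "d42"},
    {"user": "u2", "type": "query", "q": "phone"},
]
sets_by_user = build_activity_sets(events)
print(sorted(sets_by_user))     # ['u1', 'u2']
print(len(sets_by_user["u1"]))  # 2 events in u1's activity set
```

The same grouping could key on session ID instead of (or in addition to) user, per the "merge sessions and users" point above.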

10

Engine Score

•  Group activity by session and/or user (queries & clicks)
•  Determine “relevant” documents
   o  What did the user view? Add to cart? Purchase?
   o  Did the search engine return what the user ultimately wanted?
•  Determine engine score per query based on the user’s point of view
   o  Σ power(FACTOR, position) * isRelevant[user, searchResult[position].DocID]
   o  (Note: many other formulae are possible: MRR, MAP, DCG, etc.)
•  Average score over all of a user’s queries = user score
•  Average scores across all users = final engine score
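The per-query formula and the two levels of averaging can be sketched as follows. The FACTOR value of 0.7 and the data shapes are illustrative assumptions, not values from the talk:

```python
def query_score(result_doc_ids, relevant_doc_ids, factor=0.7):
    """Sum of power(FACTOR, position) * isRelevant for one query: each
    relevant document contributes factor**position, so relevant hits
    near the top of the results are worth the most."""
    return sum(
        factor ** pos
        for pos, doc_id in enumerate(result_doc_ids)
        if doc_id in relevant_doc_ids
    )

def engine_score(users):
    """users: one list per user of (results, relevant_set) pairs.
    Average per-query scores into a user score, then average the user
    scores into the final engine score."""
    user_scores = [
        sum(query_score(results, rel) for results, rel in queries) / len(queries)
        for queries in users
    ]
    return sum(user_scores) / len(user_scores)

print(query_score(["a", "b", "c"], {"a"}))  # 1.0 (relevant doc at rank 0)
```

As the slide notes, MRR, MAP, or DCG could be swapped in for `query_score` without changing the averaging structure.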

11

The FACTOR (K)
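The slide’s chart is not reproduced in this transcript, but the idea is easy to show numerically: K controls how steeply lower result positions are discounted. A quick sketch with hypothetical K values:

```python
def position_weights(k, n=5):
    """Weight k**position given to each of the first n result
    positions; smaller k discounts lower-ranked results more steeply."""
    return [round(k ** pos, 3) for pos in range(n)]

for k in (0.5, 0.7, 0.9):
    print(k, position_weights(k))
```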

12

Off-Line Engine Analysis

o  Can we re-compute this array for all queries?
o  ANSWER: Yes!

Σ power(FACTOR, position) * isRelevant[user, searchResult[position].DocID]

[Diagram: an offline re-query process replays the search engine query logs against the search engine (possibly embedded) and stores the new results in a Big Data array.]
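A minimal sketch of the offline re-query step, assuming a callable engine and a relevance lookup derived from the click logs (all names and data here are hypothetical):

```python
def offline_requery(query_logs, relevance, engine, factor=0.7):
    """Replay logged queries against the engine under evaluation
    (possibly embedded) and recompute the score array offline, with no
    production deployment involved. `engine` is any callable mapping a
    query to a ranked list of doc IDs; `relevance` maps
    (user, doc_id) -> bool, derived from the click logs."""
    scored = []
    for user, query in query_logs:
        new_results = engine(query) or []
        score = sum(
            factor ** pos
            for pos, doc in enumerate(new_results)
            if relevance.get((user, doc), False)
        )
        scored.append((user, query, score))
    return scored

# toy in-process "engine" and logs
engine = {"laptop": ["d1", "d2"]}.get
relevance = {("u1", "d2"): True}
print(offline_requery([("u1", "laptop")], relevance, engine))
# [('u1', 'laptop', 0.7)]
```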

13

Continuous Improvement Cycle

[Diagram: a continuous cycle: modify engine → execute queries (from the log files) → compute engine score → evaluate results → modify engine again, producing a score per engine version.]
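A small harness can drive this cycle and record a score per engine version, so each modification can be evaluated and kept or rolled back. The toy engines and scoring callable below are hypothetical:

```python
def score_per_version(versions, score_fn):
    """Score every candidate engine version against the same offline
    query set, keeping a per-version history."""
    return [(name, score_fn(engine)) for name, engine in versions]

# hypothetical engine versions: each maps a query to ranked doc IDs
v1 = {"laptop": ["d9", "d2"]}.get
v2 = {"laptop": ["d2", "d9"]}.get
queries = [("u1", "laptop")]
relevant = {("u1", "d2")}

def score_fn(engine):
    total = 0.0
    for user, q in queries:
        for pos, doc in enumerate(engine(q)):
            if (user, doc) in relevant:
                total += 0.7 ** pos
    return total / len(queries)

print(score_per_version([("v1", v1), ("v2", v2)], score_fn))
# v2 ranks the relevant doc first, so it scores higher
```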

14

Watch the Score Improve Over Time

15

What else can we do with Engine Scoring?

Predictive Analytics

16

The Brutal Truth about Search Engine Scores

•  Random ad-hoc formulae put together
   o  No statistical or mathematical foundation
•  TF/IDF → all kinds of inappropriate biases
   o  Bias towards document size (smaller / larger)
   o  Bias towards rare (misspelled? archaic?) words
   o  Not scalable (different scores on different shards)
•  Same formula since the 1970s

They are not based on science. We can do better!

17

We use Big Data to Predict Relevancy

[Diagram: content sources (search project docs, web site pages, support pages, landing pages) flow through connectors and content processing into the search index; a copy of the content, together with search click logs, query logs, financial data, and business data, feeds a Big Data cluster that builds the relevancy model.]

18

Probability Scoring / Predictive Relevancy

[Diagram: 0/1 clicked?/purchased? labels, combined with product signals, query signals, user signals, and comparison signals, train a predictive-analytics statistical model that predicts the probability of relevancy.]
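As a rough sketch of such a statistical model, here is a from-scratch logistic regression over two toy signals. Everything here is illustrative: a real system would use a proper ML library and the full product/query/user/comparison signal sets:

```python
import math

def train_logistic(rows, labels, lr=0.1, epochs=200):
    """Tiny logistic-regression trainer: learn weights mapping a
    feature vector of signals to a click/purchase probability."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
            g = p - y  # gradient of log-loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Predicted probability of relevancy, in [0, 1]."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

# toy signals: [query-title match, popularity]; label = clicked?
rows = [[1.0, 0.2], [0.9, 0.8], [0.1, 0.3], [0.0, 0.1]]
labels = [1, 1, 0, 0]
w, b = train_logistic(rows, labels)
print(predict(w, b, [0.95, 0.5]) > 0.5)  # strong match -> likely relevant
```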

19

The Power of the Probability Score

•  The score predicts the probability of relevancy
•  Value is 0 → 1
   o  Can be used for threshold processing
   o  All documents too weak? Try something else!
   o  Can combine results from different sources / constructions together
•  Identifies what’s important
   o  Machine learning optimizes the parameters
      -  Identifies the impact and contribution of every parameter
   o  If a parameter does not improve relevancy → REMOVE IT
   o  Scoring becomes objective, not subjective (now based on SCIENCE)
   o  Allows for experimentation on parameters
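The threshold-processing point is easy to make concrete: because the score is a probability in [0, 1], a fixed cutoff means the same thing across queries and sources. A minimal sketch (the 0.3 threshold is an arbitrary illustrative value):

```python
def filter_by_threshold(scored_results, threshold=0.3):
    """Keep only documents whose predicted relevancy probability
    clears the threshold; return None when nothing survives, so the
    caller can fall back to another source or query construction."""
    kept = [(doc, p) for doc, p in scored_results if p >= threshold]
    return kept if kept else None  # None -> "try something else"

print(filter_by_threshold([("d1", 0.8), ("d2", 0.1)]))  # [('d1', 0.8)]
print(filter_by_threshold([("d1", 0.05)]))              # None
```

The same [0, 1] scale is what lets results from different sources be merged into one ranked list.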

20

And now the demo! (just like I promised)

Come out of the darkness

And into the Light!

The Age of Enlightenment for search engine accuracy is upon us!

Search Accuracy Metrics & Predictive Analytics A Big Data Use Case

Paul Nelson Chief Architect, Search Technologies

pnelson@searchtechnologies.com

Thank you!