Customizing Ranking Models for Enterprise Search: Presented by Ammar Haris & Joe Zeimen, Salesforce

OCTOBER 11-14, 2016 BOSTON, MA


Ammar Haris, Lead Software Engineer, Salesforce

Joe Zeimen, Senior Software Engineer, Salesforce


Outline

● Overview of Search @ Salesforce

● Relevance for Enterprise Search

● Executing custom machine-learned models in Solr

○ Using Function Queries

○ Leveraging SearchComponent

Search @ Salesforce

[Figure: types of searchable data. Labels: Structured Data; Unstructured Social Data; Feeds; Articles]

● Most used feature of Salesforce

● 450 Billion documents

● 90 Million queries per day

● Multiple entry points

○ Web/Mobile

○ Salesforce Object Search Language (SOSL) API

How We Index 450 Billion Documents @ Salesforce

[Diagram: indexing pipeline. Components: App Tier, Entity Table, Index Metadata, Cron Job, Message Queue, Solr Tier (Solr Cores)]

1. Create/Update

2. SQL

3. Trigger

4. Polling

5. Enqueue (last unindexed)

6. Lookup core

7. Fetch data

8. Send for indexing

Querying @ Salesforce (90 Million per day)

[Diagram, built up over three slides: query flow. Components: Query Front End, Querying Service, Index Metadata, Database, Solr Tier (Solr Cores). The final build adds the labels Records and Access Checks.]

Relevance for Enterprise Search

Challenges:

● A single search engine needs to cater to multiple customers/organizations.

● Many different types of structured and unstructured documents may exist for a single organization.

● Ranking models: one size fits all may not work well across different organizations/document types.

Relevance @ Salesforce

3 Tier Search Relevance

● L0 - Preliminary ranking of documents matched against a given search query, based on similarity score.

● L1 - Additional document-level and user-level attributes are used to further refine the ranked documents.

● L2 - Final level of document aggregation and re-ranking/sorting.

Relevance @ Salesforce

3 Tier Search Relevance

● L0 - Solr-level relevance (primarily based on TF-IDF and some field-level boosts). Does not have access to query-independent document-level features.

● L1 - Application-level relevance: static rank (query-independent document scoring) and re-ranking, applied to the top 250 documents only because of performance constraints.

● L2 - Database-level relevance: re-ranking of the top 25 documents based on features available only during the final DB query for user access checks.
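The tiered scheme above can be read as a cascade in which each tier re-scores only the top-k survivors of the previous tier. The following is a minimal sketch of that idea, not the Salesforce implementation: the class `RankingCascade`, the `Doc` type, and the scorer arguments are hypothetical names introduced for illustration. Only the cut-offs of 250 and 25 come from the slide.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

public class RankingCascade {
    /** Hypothetical scored document: an id plus the score assigned by the latest tier. */
    public static final class Doc {
        public final String id;
        public final double score;
        public Doc(String id, double score) { this.id = id; this.score = score; }
    }

    /** Re-scores the first k docs with the given scorer and returns them sorted by new score, descending. */
    public static List<Doc> rerankTopK(List<Doc> ranked, int k, ToDoubleFunction<Doc> scorer) {
        List<Doc> out = new ArrayList<>();
        for (Doc d : ranked.subList(0, Math.min(k, ranked.size()))) {
            out.add(new Doc(d.id, scorer.applyAsDouble(d)));
        }
        out.sort(Comparator.comparingDouble((Doc d) -> d.score).reversed());
        return out;
    }

    /** L0 output (already sorted by similarity) -> L1 on the top 250 -> L2 on the top 25. */
    public static List<Doc> run(List<Doc> l0Sorted, ToDoubleFunction<Doc> l1, ToDoubleFunction<Doc> l2) {
        List<Doc> afterL1 = rerankTopK(l0Sorted, 250, l1);
        return rerankTopK(afterL1, 25, l2);
    }
}
```

The sketch makes the limitation visible: a document ranked 251st by L0 can never be rescued by L1, however strong its static rank.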

Moving Search Relevance to Solr

Intent

● Have all 3 tiers of relevance co-hosted and abstracted out of the application tier.

Motivation

● Apply the static rank to a wider set of documents rather than a limited subset.

● Creates the ability to run more complex models.

● Provides additional flexibility to the multi-layered machine learning ranking framework.

Relevance @ Salesforce - Original Architecture

[Diagram. Components: Search Server, App Server, DB, Index. Rankers: L0 Ranker (tf, idf, coefs, field boost); L1 Ranker (features: popularity, inbound links); L2 Ranker (result aggregation + re-ranking). Arrow labels: Config (coefs); Query, coefs; Id, score; Query; Id, score]

Relevance @ Salesforce - Original Architecture

[Diagram. Components: Search Server, App Server, DB, Index. Rankers: L0 Ranker (tf, idf, coefs, field boosts); L1 Ranker (features, freshness, popularity); L2 Ranker (result aggregation). Arrow labels: Config (coefs); Query, coefs; Id, score; Query; Id, score; ID, score, features]

L2 Ranker

The document aggregator lives in the app.

In the application tier, results from multiple Solr cores are merged together.

Scores are normalized over the maximum score, and cross-core documents are re-ranked based on the final Solr score.
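The merge step can be sketched as follows. This is an illustration under stated assumptions rather than the production aggregator: the slide does not say whether "maximum score" means the per-core or the global maximum, and the sketch assumes per-core, since that is what makes scores from different cores comparable. `CrossCoreMerger` and `Hit` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class CrossCoreMerger {
    /** Hypothetical search hit: document id plus its Solr score. */
    public static final class Hit {
        public final String id;
        public final double score;
        public Hit(String id, double score) { this.id = id; this.score = score; }
    }

    /**
     * Divides each hit's score by the maximum score of its own core (assumption),
     * then merges all cores and sorts by normalized score, descending.
     */
    public static List<Hit> merge(Map<String, List<Hit>> hitsByCore) {
        List<Hit> merged = new ArrayList<>();
        for (List<Hit> hits : hitsByCore.values()) {
            double max = 0.0;
            for (Hit h : hits) {
                max = Math.max(max, h.score);
            }
            for (Hit h : hits) {
                // Guard against an empty or all-zero core to avoid dividing by zero.
                double normalized = max > 0.0 ? h.score / max : 0.0;
                merged.add(new Hit(h.id, normalized));
            }
        }
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return merged;
    }
}
```

After normalization the top hit of every core scores 1.0, so a core whose raw scores happen to be small is not buried by a core with large raw scores.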

L0 Ranker

Basic tf-idf Similarity Score

● Leverages relevance-related features provided by Solr out of the box:

○ Boost specific fields of the document, if matched.
■ Title/Name field, document owner id field

○ De-boost documents on specific fields.
■ Is record inactive

○ Use a function query to apply custom linear functions on select features.
■ product(8.429,floor(div(max(0,log(floor(product(pow(0.98,div(sub(ms(),feature.pageViewsLastUpdated),84600000)),feature.pageViews)))),log(2.718))))
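To make the page-view function query easier to read, here is the same arithmetic unrolled into plain Java. It is a sketch for reading the formula, not code from the talk: `PageViewBoost` and its parameter names are hypothetical. It assumes Solr's `log` function is base 10, so dividing by `log(2.718)` converts the result to an approximate natural logarithm. The constant 84600000 is copied as written on the slide (a calendar day is 86,400,000 ms).

```java
public class PageViewBoost {
    // Constants copied from the function query on the slide.
    static final double WEIGHT = 8.429;
    static final double DECAY_BASE = 0.98;
    static final double PERIOD_MS = 84_600_000.0;

    /**
     * Mirrors: product(8.429, floor(div(max(0, log(floor(product(
     *   pow(0.98, div(sub(ms(), lastUpdated), 84600000)), pageViews)))), log(2.718))))
     */
    public static double boost(long nowMs, long lastUpdatedMs, double pageViews) {
        // Age of the page-view counter, measured in decay periods.
        double periods = (nowMs - lastUpdatedMs) / PERIOD_MS;
        // Page views lose 2% of their weight per period.
        double decayedViews = Math.floor(Math.pow(DECAY_BASE, periods) * pageViews);
        // log10(0) is -Infinity; max(0, ...) clamps that case to zero.
        double log10Views = Math.max(0.0, Math.log10(decayedViews));
        // Change of base: log10(x) / log10(2.718) is approximately ln(x).
        return WEIGHT * Math.floor(log10Views / Math.log10(2.718));
    }
}
```

The outer `floor` turns the boost into a step function: it only increases each time the decayed page-view count grows by roughly a factor of e.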

L1 Ranker (DeepRanker)

● Re-ranker: allows running more complex/expensive models on a subset of matched documents

● Consumes features from stored fields and docValues fields

● Enables use of the same feature vector across different ranking layers

● Features consumed and cached during L0 ranker execution may be used in the L1 ranker as well

● Makes it easy to plug in different kinds of relevance models (boosted decision trees, polynomials, etc.)


Model Execution/Deep Ranker (Solr SearchComponent)

[Diagram: DeepRanker extends SearchComponent. Input: ResponseBuilder, with the model id in the query params. Models: Model A, Model B, Model C, each with run() and a Feature Extractor. Output: Scored Results]
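The dispatch shown in the diagram, where a model id in the query parameters selects which model scores the results, might be sketched like this. The sketch deliberately avoids the Solr API so that it runs on its own. `ModelRegistry`, the feature map, and the fallback-to-default behavior are all assumptions for illustration, not details from the talk.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

public class ModelRegistry {
    // Each model is reduced to a function from a feature map to a score (hypothetical simplification).
    private final Map<String, ToDoubleFunction<Map<String, Double>>> models = new HashMap<>();
    private final String defaultModelId;

    public ModelRegistry(String defaultModelId) {
        this.defaultModelId = defaultModelId;
    }

    public void register(String modelId, ToDoubleFunction<Map<String, Double>> model) {
        models.put(modelId, model);
    }

    /** Scores with the requested model, falling back to the default when the id is unknown (assumption). */
    public double score(String requestedModelId, Map<String, Double> features) {
        ToDoubleFunction<Map<String, Double>> model = models.get(requestedModelId);
        if (model == null) {
            model = models.get(defaultModelId);
        }
        if (model == null) {
            throw new IllegalStateException("No model registered for id or default: " + requestedModelId);
        }
        return model.applyAsDouble(features);
    }
}
```

In a real SearchComponent the requested id would be read from the request parameters and the features from the index. Both are plain arguments here.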

Online (Query Time) Feature Extraction

Once the user query is received and parsed, the feature vector that feeds the model is extracted.

Typical ranker features include TF score, IDF score, phrase match score, document recency, popularity score, etc.

Feature extraction is triggered from a custom Solr SearchComponent, which loads all the document-related and query-related features into a feature vector.

Once the features are extracted and loaded into a feature vector, they are passed to the model executor.

Model Execution

public abstract class Model {

    // Scores a single document from its extracted feature vector.
    public abstract double run(@NonNull final FeatureContainer features) throws Exception;

    // Default feature extractor; subclasses can override this method.
    public FeatureExtractor getFeatureExtractorForModel() {
        return new FieldLevelExtractorImpl(false);
    }
}

Models may be customized per organization/object type.

After model execution, the document ids and their scores are passed to the Query Front End.
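A concrete model under this abstraction might look like the following weighted-sum example. It is a self-contained sketch: the `FeatureContainer` here is a simplified stand-in (a name-to-value map) for the class on the slide, whose real API is not shown, and `LinearModel` and the feature names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class LinearModelSketch {
    /** Simplified stand-in for the slide's FeatureContainer: feature name -> value. */
    public static final class FeatureContainer {
        private final Map<String, Double> values = new HashMap<>();

        public FeatureContainer put(String name, double value) {
            values.put(name, value);
            return this;
        }

        /** Returns the feature value, or 0.0 when the feature was not extracted. */
        public double get(String name) {
            return values.getOrDefault(name, 0.0);
        }
    }

    /** Mirrors the abstract Model from the slide, without the feature extractor hook. */
    public abstract static class Model {
        public abstract double run(FeatureContainer features) throws Exception;
    }

    /** Weighted sum of named features; missing features contribute 0. */
    public static final class LinearModel extends Model {
        private final Map<String, Double> weights;

        public LinearModel(Map<String, Double> weights) {
            this.weights = new HashMap<>(weights);
        }

        @Override
        public double run(FeatureContainer features) {
            double score = 0.0;
            for (Map.Entry<String, Double> w : weights.entrySet()) {
                score += w.getValue() * features.get(w.getKey());
            }
            return score;
        }
    }
}
```

Because every model exposes the same `run` contract, a boosted decision tree or polynomial model could replace `LinearModel` without changing the caller.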

Demo

Design Considerations and Limitations

DeepRanker as Solr SearchComponent
Last search component to run in the Solr search query pipeline

Doc Values vs Stored Values
Stored values are better suited to reading multiple values.
Doc values are ideal for storing per-document values, with support for primitive types (int, long).

Number of documents a model can run on

Feature extraction/model execution timeouts

Future Plans for Deep Ranker

Figure out how to put the deep ranker near the beginning of the pipeline so it runs on the full corpus (integrating the deep ranker models with the Similarity class).

Move the L2 Ranker (document aggregator) out of the application tier into a separate aggregation service, and add social signals to this ranker.

Ingest additional signals into the Solr rankers that may not be part of the search index.

Summary

Search at Salesforce
Enterprise search must cater to a variety of use cases and types of data

Deep Ranker Solr offers solutions
Easily and dynamically use different models for different situations
Run on more results than previously possible
Use the same features across other ranking layers

We are Hiring!

Join Salesforce Search Cloud: ML Engineers, Engineering Managers, Software Engineers, Data Scientists

Mining Intent @ Work
