Customizing Ranking Models for Enterprise Search: Presented by Ammar Haris & Joe Zeimen, Salesforce
Customizing Ranking Models for Enterprise Search
Ammar Haris, Lead Software Engineer, Salesforce
Joe Zeimen, Senior Software Engineer, Salesforce
Forward-Looking Statements
Statement under the Private Securities Litigation Reform Act of 1995:
This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent fiscal quarter. These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
Outline
● Overview of Search @ Salesforce
● Relevance for Enterprise Search
● Executing custom machine-learned models in Solr
○ Using Function Queries
○ Leveraging SearchComponent
Search @ Salesforce
● Most used feature of Salesforce
● 450 Billion documents
● 90 Million queries per day
● Multiple entry points
○ Web/Mobile
○ Salesforce Object Search Language (SOSL) API
How We Index 450 Billion Documents @ Salesforce
[Diagram: indexing pipeline. Components: App Tier, Entity Table, Last Unindexed, Index Metadata, Message Queue, Cron Job, Solr Tier, Solr Cores.]
1. Create/Update
2. SQL
3. Trigger
4. Polling
5. Enqueue
6. Lookup Core
7. Fetch Data
8. Send for Indexing
Querying @ Salesforce (90 Million per day)
[Diagram: query path. Components: Query Front End, Querying Service, Database, Index Metadata, Records, Access Checks, Solr Tier, Solr Cores.]
Relevance for Enterprise Search
Challenges:
● A single search engine needs to cater to multiple customers/organizations.
● Many different types of structured and unstructured documents may exist for a single organization.
● Ranking models: one size fits all may not work well across different organizations/document types.
Relevance @ Salesforce
3 Tier Search Relevance
● L0 - Preliminary ranking of documents matched against a given search query, based on similarity score.
● L1 - Additional document- and user-level attributes used to further refine the ranked documents.
● L2 - Final level of document aggregation and re-ranking/sorting.
Relevance @ Salesforce
3 Tier Search Relevance
● L0 - Solr-level relevance (primarily based on TF-IDF and some field-level boosts). Does not have access to query-independent document-level features.
● L1 - Application-level relevance: static rank/query-independent document scoring and re-ranking on the top 250 documents only, due to performance constraints.
● L2 - Database-level relevance: re-ranking of the top 25 documents based on features available only during the final DB query for user access checks.
Moving Search Relevance to Solr
Intent
● Have all 3 tiers of relevance co-hosted and abstracted out of the application tier.
Motivation
● Have the static rank applied to a wider set of documents rather than a limited set.
● Creates the ability to run more complex models.
● Provides additional flexibility to the multi-layered machine learning ranking framework.
Relevance @ Salesforce - Original Architecture
[Diagram: Search Server, App Server, DB Index. L0 Ranker (tf, idf, coefs, field boost); L1 Ranker (features: popularity, inbound links); L2 Ranker (result aggregation + re-ranking). Arrow labels: Config (coefs); Query, coefs; Id, score; Query; Id, score.]
Relevance @ Salesforce - Original Architecture
[Diagram: Search Server, App Server, DB Index. L0 Ranker (tf, idf, coefs, field boosts); L1 Ranker (features, freshness, popularity); L2 Ranker (result aggregation). Arrow labels: Config (coefs); Query, coefs; Id, score; Query; Id, score; ID, score, features.]
L2 Ranker
The document aggregator lives in the app.
In the application tier, results from multiple Solr cores are merged together.
Scores are normalized over the maximum score, and cross-core documents are re-ranked based on the final Solr score.
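The normalize-and-merge step described for the L2 Ranker can be sketched roughly as follows. This is a minimal illustration in plain Java, not the Salesforce implementation: the names `ScoredDoc`, `CrossCoreMerger`, and `mergeAndRank` are hypothetical, and the sketch assumes each core returns (id, score) pairs with non-negative scores that are normalized against that core's own maximum.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical value type: a document id and its Solr score from one core.
final class ScoredDoc {
    final String id;
    final double score;

    ScoredDoc(String id, double score) {
        this.id = id;
        this.score = score;
    }
}

// Sketch of cross-core aggregation: divide each core's scores by that core's
// maximum score, then merge all cores and sort by the normalized score.
final class CrossCoreMerger {

    static List<ScoredDoc> mergeAndRank(List<List<ScoredDoc>> perCoreResults) {
        List<ScoredDoc> merged = new ArrayList<>();
        for (List<ScoredDoc> coreResults : perCoreResults) {
            double max = 0.0;
            for (ScoredDoc d : coreResults) {
                max = Math.max(max, d.score);
            }
            for (ScoredDoc d : coreResults) {
                // Guard against an all-zero core to avoid dividing by zero.
                double normalized = max > 0.0 ? d.score / max : 0.0;
                merged.add(new ScoredDoc(d.id, normalized));
            }
        }
        // Highest normalized score first; List.sort is stable, so ties keep input order.
        merged.sort(Comparator.comparingDouble((ScoredDoc d) -> d.score).reversed());
        return merged;
    }
}
```

Normalizing per core makes scores from differently sized cores comparable on a 0 to 1 scale, at the cost of giving every core's best hit the same top score.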
L0 Ranker
Basic tf-idf Similarity Score
● Leverages relevance-related features provided by Solr out of the box:
○ Boost specific fields of the document, if matched.
■ Title/Name field, document owner id field
○ De-boost documents on specific fields
■ Is record inactive
○ Use function query to apply custom linear functions on select features.
■ product(8.429,floor(div(max(0,log(floor(product(pow(0.98,div(sub(ms(),feature.pageViewsLastUpdated),84600000)),feature.pageViews)))),log(2.718))))
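To make the nested function query easier to read, the same arithmetic can be written out in plain Java. This is only a reading aid, not how Solr evaluates it: the class and method names (`PageViewBoost`, `boost`) are hypothetical, the constants are copied as they appear on the slide, and `Math.log10` is used because Solr's `log` function is base 10.

```java
// Plain-Java rendering of the arithmetic in the page-view function query.
final class PageViewBoost {
    // Constants copied verbatim from the slide's function query.
    static final double WEIGHT = 8.429;
    static final double DECAY_BASE = 0.98;
    static final double MS_PER_PERIOD = 84600000.0;

    static double boost(long nowMs, long pageViewsLastUpdatedMs, long pageViews) {
        // div(sub(ms(), lastUpdated), 84600000): age of the counter in periods.
        double periods = (nowMs - pageViewsLastUpdatedMs) / MS_PER_PERIOD;
        // floor(product(pow(0.98, periods), pageViews)): exponentially decayed views.
        double decayedViews = Math.floor(Math.pow(DECAY_BASE, periods) * pageViews);
        // max(0, log(...)): log10(0) is -Infinity, which the max clamps to 0.
        double clamped = Math.max(0.0, Math.log10(decayedViews));
        // Dividing by log10(2.718) converts to an approximate natural log.
        return WEIGHT * Math.floor(clamped / Math.log10(2.718));
    }
}
```

Read this way, the boost grows in steps with the logarithm of the page-view count, and the count is discounted by a factor of 0.98 per elapsed period since it was last updated.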
L1 Ranker (DeepRanker)
● Re-ranker: allows running more complex/expensive models on a subset of matched documents
● Consumes features from stored fields and docValues fields
● Enables use of the same feature vector across different ranking layers
● Features consumed and cached during L0 ranker execution may be used in the L1 ranker as well
● Makes it easy to plug in different kinds of relevance models (boosted decision trees, polynomials, etc.)
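The idea of re-ranking only a subset of matched documents with a pluggable model might look like the sketch below. It is written in plain Java without Solr dependencies, and every name in it (`Candidate`, `RelevanceModel`, `TopNReRanker`) is hypothetical; it assumes the input list is already sorted by the L0 score, best first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical candidate: an id, its first-pass (L0) score, and a feature vector.
final class Candidate {
    final String id;
    final double l0Score;
    final Map<String, Double> features;

    Candidate(String id, double l0Score, Map<String, Double> features) {
        this.id = id;
        this.l0Score = l0Score;
        this.features = features;
    }
}

// Hypothetical plug-in point: any model that maps a candidate to a score.
interface RelevanceModel {
    double score(Candidate candidate);
}

final class TopNReRanker {
    // Re-scores only the first topN candidates and re-sorts that prefix;
    // the tail keeps its L0 order. Returns document ids in final order.
    static List<String> reRank(List<Candidate> l0Ranked, int topN, RelevanceModel model) {
        int n = Math.min(topN, l0Ranked.size());
        final double[] scores = new double[n];
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            scores[i] = model.score(l0Ranked.get(i)); // run the model once per document
            order[i] = i;
        }
        // Stable sort, highest model score first.
        Arrays.sort(order, (a, b) -> Double.compare(scores[b], scores[a]));
        List<String> ids = new ArrayList<>();
        for (int idx : order) {
            ids.add(l0Ranked.get(idx).id);
        }
        for (int i = n; i < l0Ranked.size(); i++) {
            ids.add(l0Ranked.get(i).id);
        }
        return ids;
    }
}
```

Because the model is an interface, a boosted decision tree or a polynomial could be swapped in without touching the re-ranking loop, and `topN` bounds how much expensive scoring happens per query.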
Model Execution/Deep Ranker (Solr SearchComponent)
[Diagram: DeepRanker extends SearchComponent. Input: ResponseBuilder, with model id in query params. Models A, B, and C, each with run() and a Feature Extractor. Output: Scored Results.]
Online (Query Time) Feature Extraction
Once the user query is received and parsed, the feature vector is extracted, which will feed into the model.
Typical ranker features include TF score, IDF score, phrase match score, document recency, popularity score, etc.
Feature extraction is triggered from a custom Solr SearchComponent and loads up all the document and query related features into a feature vector.
Once the features are extracted and loaded into a feature vector, they are passed over to the model executor.
Model Execution
public abstract class Model {
    public abstract double run(@NonNull final FeatureContainer features) throws Exception;

    public FeatureExtractor getFeatureExtractorForModel() {
        return new FieldLevelExtractorImpl(false);
    }
}
Model may be customized at a per organization/object type level
After model execution the document ids and their scores are passed to the Query Front End.
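As an illustration of per-organization model customization, here is a rough, self-contained sketch. It does not use the deck's actual `FeatureContainer` or `FeatureExtractor` types, whose definitions are not shown; `SimpleFeatureContainer`, `SketchModel`, `LinearSketchModel`, and `ModelRegistry` are hypothetical stand-ins, and the checked exception on `run` is dropped for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical stand-in for the FeatureContainer named on the slide.
final class SimpleFeatureContainer {
    private final Map<String, Double> values = new HashMap<>();

    SimpleFeatureContainer with(String name, double value) {
        values.put(name, value);
        return this;
    }

    double get(String name) {
        // Missing features default to 0 so that a model never sees null.
        Double v = values.get(name);
        return v == null ? 0.0 : v;
    }
}

// Mirrors the shape of the slide's abstract Model, minus the extractor hook.
abstract class SketchModel {
    abstract double run(SimpleFeatureContainer features);
}

// One possible model: a weighted sum of named features.
final class LinearSketchModel extends SketchModel {
    private final Map<String, Double> weights;

    LinearSketchModel(Map<String, Double> weights) {
        this.weights = new HashMap<>(weights);
    }

    @Override
    double run(SimpleFeatureContainer features) {
        double score = 0.0;
        for (Map.Entry<String, Double> w : weights.entrySet()) {
            score += w.getValue() * features.get(w.getKey());
        }
        return score;
    }
}

// Hypothetical registry: picks a model per organization and object type,
// falling back to a default model when no customization exists.
final class ModelRegistry {
    private final Map<String, SketchModel> byKey = new HashMap<>();
    private final SketchModel fallback;

    ModelRegistry(SketchModel fallback) {
        this.fallback = fallback;
    }

    void register(String orgId, String objectType, SketchModel model) {
        byKey.put(orgId + ":" + objectType, model);
    }

    SketchModel lookup(String orgId, String objectType) {
        SketchModel m = byKey.get(orgId + ":" + objectType);
        return m != null ? m : fallback;
    }
}
```

A caller would look up the model for the current organization and object type, then call `run` on each document's features; organizations without a registered model share the fallback.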
Design Considerations and Limitations
DeepRanker as Solr SearchComponent: last search component to run in the Solr search query pipeline
Doc Values vs Stored Values: stored values are more suited for reading multiple values; doc values are ideal for storing per-document values with support for primitive types (int, long)
Number of documents a model can run on
Feature extraction/model execution timeouts
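One way to reason about the document-count and timeout limits together is a scoring loop with both a document cap and a time budget. The sketch below is an assumption about how such a guard could be written, not a description of DeepRanker itself; `BudgetedScorer` and its parameters are hypothetical, and the clock is injected so the behavior can be tested deterministically.

```java
import java.util.function.LongSupplier;
import java.util.function.ToDoubleFunction;

final class BudgetedScorer {
    // Scores docs in order until maxDocs is reached or the time budget is used
    // up. Entries that are not reached keep whatever score they already had in
    // the scores array (for example, the L0 score).
    // Returns the number of documents the model actually scored.
    static <T> int scoreWithinBudget(T[] docs, double[] scores,
                                     ToDoubleFunction<? super T> model,
                                     int maxDocs, long budgetNanos,
                                     LongSupplier clock) {
        long start = clock.getAsLong();
        int limit = Math.min(maxDocs, docs.length);
        int scored = 0;
        // Compare elapsed time rather than absolute timestamps so the check
        // stays correct even if the nanosecond clock wraps around.
        while (scored < limit && clock.getAsLong() - start < budgetNanos) {
            scores[scored] = model.applyAsDouble(docs[scored]);
            scored++;
        }
        return scored;
    }
}
```

In production the clock would typically be `System::nanoTime`; documents past the cutoff simply fall back to their earlier-tier scores instead of failing the query.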
Future Plans for Deep Ranker
Figure out how to put the deep ranker near the beginning of the pipeline to run on the full corpus (integrating the deep ranker models with the Similarity class).
Move the L2 Ranker (document aggregator) out of the application tier into a separate aggregation service and add social signals to this ranker.
Ingest additional signals into the Solr rankers that may not be part of the search index.
Summary
● Search at Salesforce
○ Enterprise search must cater to a variety of use cases and types of data
● Deep Ranker Solr offers solutions
○ Easily and dynamically use different models for different situations
○ Run on more results than previously possible
○ Use same features across other ranking layers
We are Hiring!
ML Engineers, Engineering Managers, Software Engineers, Data Scientists
Join Salesforce Search Cloud
Mining Intent @ Work