Inside the Force.com Query Optimizer Webinar

Inside the Force.com Query Optimizer From salesforce.com's Customer Centric Engineering – Technical Enablement team

description

The Salesforce Platform allows developers to build enterprise applications using Visualforce, Apex and SOQL. To ensure that your applications perform and scale as your business grows, you'll want to write efficient and selective queries. The Force.com query optimizer uses several algorithms to determine the best SQL to generate from your SOQL. Some factors involved in this process include multitenancy, metadata and indexes. Watch this webinar to:

•  Get an overview of multitenancy and metadata
•  Understand how to write selective and scalable SOQL queries
•  Learn how the Force.com query optimizer converts SOQL to SQL
•  See examples of the performance impact of indexes
•  Find out how skinny tables work

Transcript of Inside the Force.com Query Optimizer Webinar

Page 1: Inside the Force.com Query Optimizer Webinar

Inside the Force.com Query Optimizer From salesforce.com’s Customer Centric Engineering – Technical Enablement team

Page 2: Inside the Force.com Query Optimizer Webinar

Join the conversation: #forcewebinar

Safe harbor Safe harbor statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services. The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-Q for the most recent fiscal quarter ended July 31, 2012. 
These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site. Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make their purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.

Page 3: Inside the Force.com Query Optimizer Webinar

Speakers

John Tan
Architect Evangelist @johntansfdc

Jaikumar Bathija
Architect – DB Performance @

Page 4: Inside the Force.com Query Optimizer Webinar

Follow Developer Force for the latest news

@forcedotcom / #forcewebinar

Developer Force group

Developer Force – Force.com Community

+Developer Force – Force.com Community

Developer Force

Page 5: Inside the Force.com Query Optimizer Webinar

Architect Core Resource page

•  Featured content for architects
•  Articles, papers, blog posts, events

•  Follow us on Twitter

Updated weekly!

http://developer.force.com/architect

Page 6: Inside the Force.com Query Optimizer Webinar

Have questions?

§  We have an expert support team at the ready to answer your questions during the webinar.

§  Ask your questions via the GoToWebinar Questions Pane.

§  The speaker(s) will choose top questions to answer live at the end of the webinar.

§  Please post your questions as we go along!

§  Only post your question once; we’ll get to it as we go down the list.

Page 7: Inside the Force.com Query Optimizer Webinar

Today’s Learning Goal

AWARENESS

Page 8: Inside the Force.com Query Optimizer Webinar

Why are we here?

SELECT Id
FROM Account
WHERE Status__c != 'Closed' AND
Rating = null AND CreatedDate > 2013-04-01T00:00:00Z

Empower developers to write selective queries. Don't worry, we have lots of examples.

Page 9: Inside the Force.com Query Optimizer Webinar

Selective Filters

•  Reduces the number of records in your result set.
•  Leverages indexes.
•  Avoids full table scans.

Page 10: Inside the Force.com Query Optimizer Webinar

Query Performance Impact

•  User Experience (Visualforce pages, API, Reports, Listviews, etc.).
•  Governor Limits (Timeouts, Concurrent Request Limit, Concurrent API limit, etc.).
•  Large Data Volumes (LDV).

Page 11: Inside the Force.com Query Optimizer Webinar

Page 12: Inside the Force.com Query Optimizer Webinar

Agenda

•  Design - http://developer.force.com/architect

•  Query Optimizer

•  SOQL Examples

•  Skinny Tables

•  Other Performance Factors

Page 13: Inside the Force.com Query Optimizer Webinar

Query Optimizer

Page 14: Inside the Force.com Query Optimizer Webinar

Query execution

Page 15: Inside the Force.com Query Optimizer Webinar

Multitenancy

Page 16: Inside the Force.com Query Optimizer Webinar

Basic Algorithm

•  Pre-query engine.
•  Chooses the most selective filter from the WHERE clause.
•  Determines the best leading table/index to drive the query.

Page 17: Inside the Force.com Query Optimizer Webinar

Indexing

•  Standard index: available out of the box. Many fields are indexed on both standard and custom entities.

•  Custom index: created on demand, based on performance analysis done proactively by the salesforce.com team.

•  Other indexed fields: External ID fields, fields marked unique, and foreign keys (lookup or master-detail relationship fields).

Page 18: Inside the Force.com Query Optimizer Webinar

Statistics

•  Pre-computed Statistics
   §  Row count
   §  User visibility
   §  Custom index
   §  Owner row count

Page 19: Inside the Force.com Query Optimizer Webinar

Options considered by the Optimizer

Page 20: Inside the Force.com Query Optimizer Webinar

The numbers game

•  A standard index is considered only if the filter returns fewer than 30% of the first million records and fewer than 15% of the records beyond the first million, up to a maximum of 1 million records.*

•  A custom index is considered only if the filter returns fewer than 10% of the first million records and fewer than 5% of the records beyond the first million, up to a maximum of 333,333 records.*

* The selectivity thresholds are subject to change.
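
The two rules on this slide reduce to a small piece of arithmetic. The Python sketch below is only an illustration of that arithmetic, not how the optimizer is implemented: the function and constant names are made up for this example, and the percentages and caps are the ones quoted on the slide, which are subject to change.

```python
# Illustrative sketch only: all names are hypothetical; the numbers come from the slide.
FIRST_TIER = 1_000_000  # the "first million records"


def selectivity_threshold(total_records, first_pct, rest_pct, cap):
    """Return the row count a filter must stay under for the index to be considered."""
    first = min(total_records, FIRST_TIER) * first_pct // 100
    rest = max(total_records - FIRST_TIER, 0) * rest_pct // 100
    return min(first + rest, cap)


def standard_index_threshold(total_records):
    # 30% of the first million, 15% beyond it, capped at 1,000,000 rows
    return selectivity_threshold(total_records, 30, 15, 1_000_000)


def custom_index_threshold(total_records):
    # 10% of the first million, 5% beyond it, capped at 333,333 rows
    return selectivity_threshold(total_records, 10, 5, 333_333)


if __name__ == "__main__":
    for n in (100_000, 2_000_000, 5_000_000, 10_000_000):
        print(n, standard_index_threshold(n), custom_index_threshold(n))
```

For a 2,000,000-record object this gives 450,000 for a standard index and 150,000 for a custom index. For the 100,000-record objects used in the examples later in the deck, the custom index threshold is 10,000.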

Page 21: Inside the Force.com Query Optimizer Webinar

The numbers game – Standard Index

# of records        First Threshold   Second Threshold   Final Threshold
Up to 1 million     30% of total      N/A                30% of total
Up to 2 million     300,000           150,000            450,000
Up to 3 million     300,000           300,000            600,000
Up to 4 million     300,000           450,000            750,000
Up to 5 million     300,000           600,000            900,000
Above 5.6 million   300,000           700,000            1,000,000

Page 22: Inside the Force.com Query Optimizer Webinar

The numbers game – Custom Index

# of records        First Threshold   Second Threshold   Final Threshold
Up to 1 million     10% of total      N/A                10% of total
Up to 2 million     100,000           50,000             150,000
Up to 3 million     100,000           100,000            200,000
Up to 4 million     100,000           150,000            250,000
Up to 5 million     100,000           200,000            300,000
Above 5.6 million   100,000           233,333            333,333

Page 23: Inside the Force.com Query Optimizer Webinar

Other Optimizations

AND optimizations
§  Composite Index Join: the INTERSECTION of the indexes should still meet the selectivity threshold.

OR optimizations
§  Union: the SUM of the filters should still meet the selectivity threshold.

Sort optimizations
§  If an index aligns with the ORDER BY clause and the query has a row limit, the index can be used to find the first rows quickly and exit.

Page 24: Inside the Force.com Query Optimizer Webinar

Examples

Page 25: Inside the Force.com Query Optimizer Webinar

Schema

MyCase – 100,000 Records

MyUser – 100,000 Records

Page 26: Inside the Force.com Query Optimizer Webinar

Reminder: Goal of Optimizer

•  Generate efficient SQL
•  Leverage an index to drive the query
•  Avoid full table scans

The Query Optimizer cannot make up for non-selective filters. It will make the best choice from the filters in your query.

Page 27: Inside the Force.com Query Optimizer Webinar

Selectivity

SELECT Id FROM MyCase__c WHERE Status__c = 'Closed'
will do a full table scan.

SELECT Id FROM MyCase__c WHERE Status__c = 'New'
will use the index.

Indexed Field   Value    # of Records   Selective?
Status          Closed   96,500         No
Status          New      3,500          Yes

Total # Records = 100,000
Selectivity Threshold = 10,000

Page 28: Inside the Force.com Query Optimizer Webinar

Not Equals / Not In

A not-equals filter can't use the index:

SELECT Id FROM MyCase__c WHERE Priority__c != 3
will do a full table scan.

SELECT Id FROM MyCase__c WHERE Priority__c IN (1,2)
will use the index.

Indexed Field   Value   # of Records   Selective?
Priority        1       6,000          Yes
Priority        2       3,500          Yes
Priority        3       90,500         No

Total # Records = 100,000
Selectivity Threshold = 10,000
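
One way to apply this rewrite in client code is to build the IN list from the known set of values instead of writing a not-equals filter. The Python sketch below is an illustration under assumptions: the helper name `in_filter_excluding` and the hard-coded list of priority values are made up for this example, and only the object and field names come from the slide.

```python
def in_filter_excluding(field, all_values, excluded):
    """Build a SOQL `field IN (...)` condition listing every known value except `excluded`.

    Assumes numeric values; string values would need quoting and escaping.
    """
    kept = [v for v in all_values if v != excluded]
    if not kept:
        raise ValueError("no values left to filter on")
    return f"{field} IN ({','.join(str(v) for v in kept)})"


# Priority values 1, 2 and 3 are the ones listed in the table on this slide.
condition = in_filter_excluding("Priority__c", [1, 2, 3], excluded=3)
query = f"SELECT Id FROM MyCase__c WHERE {condition}"
print(query)
```

With the record counts on this slide, the IN filter targets 6,000 + 3,500 = 9,500 records, which is under the 10,000 threshold.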

Page 29: Inside the Force.com Query Optimizer Webinar

Formula Fields

Field         Type             Formula
CaseType__c   Formula (Text)   CASE(MyUser__r.UserType__c,1,"Gold","Silver")

Can't create an index on CaseType__c because this formula spans objects.

If MyUser__r.UserType__c has an index, filter on it directly:

•  SELECT Id FROM MyCase__c WHERE MyUser__r.UserType__c = 1

Page 30: Inside the Force.com Query Optimizer Webinar

Formula Fields

Field              Type
CaseTypeClone__c   Text(255)

Or avoid the join: create a CaseTypeClone__c field and index it.

•  SELECT Id FROM MyCase__c WHERE CaseTypeClone__c = 'Gold'

Page 31: Inside the Force.com Query Optimizer Webinar

Nulls

Indexed Field   Value      # of Records   Selective?
ClosedDate      Non-Null   96,500         Yes for specific dates
ClosedDate      Null       3,500          Yes

Customer Support will need to create a custom index that includes null records. Standard indexes include nulls by default.

SELECT Id FROM MyCase__c WHERE ClosedDate__c = null
will use the index.

http://blogs.developerforce.com/engineering/2013/02/force-com-soql-best-practices-nulls-and-formula-fields.html

Total # Records = 100,000
Selectivity Threshold = 10,000

Page 32: Inside the Force.com Query Optimizer Webinar

Date & Number Range

SELECT Id FROM MyCase__c
WHERE ClosedDate__c > 2013-01-01 AND ClosedDate__c < 2013-02-01

The Query Optimizer can detect ranges only on date and number fields.

Page 33: Inside the Force.com Query Optimizer Webinar

AND conditions

Composite Index Join

SELECT Id FROM MyUser__c
WHERE FirstName__c = 'Jane' AND LastName__c = 'Doe' AND City__c = 'San Francisco'

Step 1 – Each index is still considered if its filter returns fewer than 2X the selectivity threshold

Step 2 – The INTERSECTION of all indexes must meet the *selectivity threshold

Step 3 – Use a composite index join to drive the query

*If all indexes are standard indexes, use the standard index selectivity threshold. Otherwise, use the custom index selectivity threshold.
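
Steps 1 and 2 can be sketched as a simple check. This Python snippet is an illustration, not the optimizer's code: the function name and all record counts are hypothetical, and the 10,000 threshold is the custom index threshold for a 100,000-record object used elsewhere in the deck.

```python
def composite_index_join_ok(filter_counts, intersection_count, threshold):
    """Step 1: every filter returns fewer than 2x the threshold.
    Step 2: the intersection of all filters returns fewer than the threshold."""
    each_ok = all(count < 2 * threshold for count in filter_counts.values())
    return each_ok and intersection_count < threshold


THRESHOLD = 10_000  # custom index threshold for 100,000 records

# Hypothetical counts: no single filter is selective, but each is under 2x the threshold.
counts = {"FirstName__c": 15_000, "LastName__c": 12_000, "City__c": 18_000}
print(composite_index_join_ok(counts, intersection_count=500, threshold=THRESHOLD))

# Hypothetical counts: one filter returns 25,000 rows, at or above 2x the threshold.
counts_too_broad = {"FirstName__c": 15_000, "LastName__c": 12_000, "City__c": 25_000}
print(composite_index_join_ok(counts_too_broad, intersection_count=500, threshold=THRESHOLD))
```

The first call prints True and the second prints False, because in the second case one filter fails step 1 even though the intersection is small.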

Page 34: Inside the Force.com Query Optimizer Webinar

AND conditions

Composite Index Join – MyUser object 100,000 records

Page 35: Inside the Force.com Query Optimizer Webinar

AND conditions

Composite Index Join – MyUser object 100,000 records

Page 36: Inside the Force.com Query Optimizer Webinar

2-column Index

For this simple example, it makes more sense to have Customer Support create a 2-column index.

Page 37: Inside the Force.com Query Optimizer Webinar

OR conditions

Union

SELECT Id FROM MyUser__c
WHERE FirstName__c = 'Jane' OR LastName__c = 'Doe' OR City__c = 'San Francisco'

Step 1 – Each field must be indexed and meet the selectivity threshold

Step 2 – The ADDITION of all the indexes must meet the *selectivity threshold

Step 3 – Use a union to drive the query

*If all indexes are standard indexes, use the standard index selectivity threshold. Otherwise, use the custom index selectivity threshold.
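
The OR case is stricter than the AND case, which a short check makes visible. As before, this Python sketch is illustrative only: the function name and the record counts are hypothetical, and the 10,000 threshold assumes custom indexes on a 100,000-record object.

```python
def union_ok(filter_counts, threshold):
    """Step 1: every filter returns fewer than the threshold.
    Step 2: the sum of all filters also returns fewer than the threshold."""
    each_ok = all(count < threshold for count in filter_counts.values())
    return each_ok and sum(filter_counts.values()) < threshold


THRESHOLD = 10_000  # custom index threshold for 100,000 records

# Hypothetical counts: 2,000 + 3,000 + 4,000 = 9,000, under the threshold.
selective = {"FirstName__c": 2_000, "LastName__c": 3_000, "City__c": 4_000}
print(union_ok(selective, THRESHOLD))

# Hypothetical counts: each filter is selective alone, but the sum is 12,000.
not_selective = {"FirstName__c": 4_000, "LastName__c": 4_000, "City__c": 4_000}
print(union_ok(not_selective, THRESHOLD))
```

The first call prints True and the second prints False: every OR branch adds to the total, so filters that are individually selective can still fail together.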

Page 38: Inside the Force.com Query Optimizer Webinar

OR conditions

Union – MyUser object 100,000 records

Page 39: Inside the Force.com Query Optimizer Webinar

OR conditions

Using SOSL may be a better option

•  SELECT Id FROM Account WHERE PersonMobilePhone LIKE '%123' (a leading % wildcard is as bad as a full scan)

•  SELECT Id FROM Account WHERE PersonMobilePhone = '1234567890' OR PersonHomePhone = '1234567890' OR Phone = '1234567890'

Page 40: Inside the Force.com Query Optimizer Webinar

Relationship

SELECT Id FROM MyCase__c
WHERE MyUser__r.JobType = 1 AND Priority__c = 'Priority 1'

Each index's selectivity threshold is analyzed separately, and the index with the lower threshold % is chosen.

Page 41: Inside the Force.com Query Optimizer Webinar

Soft Deletes

•  Records in the Recycle Bin with isDeleted = true
•  DO NOT USE isDeleted = false as a filter
•  Counted in pre-computed statistics
•  Use the hard delete option in the Bulk API or contact Customer Support

Page 42: Inside the Force.com Query Optimizer Webinar

Sort Optimization

•  Number and Date fields only
•  LIMIT clause required
•  Can make up for a non-selective filter

SELECT Id FROM MyCase__c ORDER BY CreatedDate LIMIT 10

SELECT Id FROM MyCase__c WHERE CreatedDate > 2001-01-01T00:00:00Z ORDER BY CreatedDate LIMIT 10

Page 43: Inside the Force.com Query Optimizer Webinar

Review

✓ Selectivity thresholds determine if an index is considered
✓ Not Equals filters will not leverage indexes
✓ Be careful filtering on Null
✓ AND conditions involve an INTERSECTION of indexes
✓ OR conditions involve an ADDITION of indexes
✓ ORDER BY with a LIMIT on an index can make up for non-selective filters

Page 44: Inside the Force.com Query Optimizer Webinar

Sharing

Page 45: Inside the Force.com Query Optimizer Webinar

Record Visibility

•  Applies only to non-Admin users.

•  Depending on your user profile, you may have visibility to only a few records or to a large number of records.

•  Sharing tables may drive query instead of index

Page 46: Inside the Force.com Query Optimizer Webinar

Skinny Table

Page 47: Inside the Force.com Query Optimizer Webinar

Skinny Table

Page 48: Inside the Force.com Query Optimizer Webinar

Skinny Tables

•  Single Object

•  Maximum of 100 fields

•  Not Aggregate/Summary. 1:1 record count between source object and skinny

•  It is not a cross-object join

•  Updates to source object automatically reflected in skinny

•  Improved performance – minimal joins since fields are in one table

Page 49: Inside the Force.com Query Optimizer Webinar

Skinny Table

Page 50: Inside the Force.com Query Optimizer Webinar

When are Skinny Tables used?

✓ After attempting to tune with custom indexes
✓ All fields selected and filtered must be in the skinny table
✓ Salesforce.com will analyze and create

Page 51: Inside the Force.com Query Optimizer Webinar

Other Performance Factors

Page 52: Inside the Force.com Query Optimizer Webinar

Performance Factors

Sharing
§  Test as a non-System Admin User

Data Skews
§  Avoid parent-child and ownership data skews

Archiving

Database Caching
§  Avoid relying on cache performance or attempting to warm the cache

Page 53: Inside the Force.com Query Optimizer Webinar

Key Takeaways

✓ Query performance improves with indexes
✓ Use selective filters to reduce the result set
✓ Query Optimizer chooses the best table/index to drive a query
✓ Skinny Tables may help when indexing is exhausted

Page 54: Inside the Force.com Query Optimizer Webinar

Cheat Sheet: Indexed Fields

http://developer.force.com/architect

Query & Search Optimization Cheat Sheet

Database

http://developer.force.com

Query Optimization Overview

When building queries, list views, and reports, it's best to create filter conditions that are selective so that Force.com scans the most appropriate rows in the objects that your queries target. This best practice is especially important when your queries target "large objects," objects containing more than one million records.

When writing SOQL, consider using the following fields, which can make your query filter conditions more selective, and improve your query response times and your database's overall performance.

Selectivity Overview

Several things can affect the selectivity of a query filter's conditions.

•  Whether the field in the condition has an index
•  Whether the value in the condition is selective relative to the total number of records in the object. These numbers determine the selectivity threshold, which the Force.com query optimizer uses to ensure that the most appropriate index, if any, drives each of your queries.
•  Whether the operator in the condition permits the use of available indexes

When writing your queries, remember the following selectivity conditions and tips.

SOQL

Fields with Database Indexes

Primary Keys
•  Id
•  Name
•  OwnerId

Foreign Keys
•  CreatedById
•  LastModifiedById
•  Lookup fields
•  Master-detail relationship fields

Audit Dates
•  CreatedDate
•  LastActivityDate
•  SystemModstamp

Custom Fields
•  Unique fields
•  External ID fields

Index Selectivity Conditions and Thresholds

Unary Condition: Standard Index
Force.com uses a standard index if the filter targets less than:
•  30% of the first million records
•  15% of all records after the first million records
•  1 million total records

Unary Condition: Custom Index
Force.com uses a custom index if the filter targets less than:
•  10% of the first million records
•  5% of all records after the first million records
•  333,333 total records

AND Condition
Force.com uses a composite index join if the filter targets less than:
•  Twice the index selectivity thresholds for each field
•  The index selectivity thresholds for the intersection of those fields

OR Condition
Force.com uses a union if the filter targets less than:
•  The index selectivity thresholds for each field
•  The index selectivity thresholds for the sum of those fields

LIKE Condition
For conditions that don't start with a leading wildcard, Force.com tests the first 100,000 rows for selectivity.

Query Optimization Resources

In addition to this cheat sheet's previous sections, we recommend reading the following related resources, which can help you retrieve the records you want from a large volume of data, and do so quickly and efficiently.

•  Best Practices for Deployments with Large Data Volumes (white paper)
•  Force.com Apex Code Developer's Guide (guide)
•  Force.com Blogs: Engineering (blog posts)
•  How to Improve Listview Performance (Salesforce Knowledge article)
•  In the online help:
   » "Build Effective Filters"
   » "Getting the Most Out of Filter Logic"
   » "Improve Report Performance"

Index Selectivity Exceptions

When you build a filter condition with the following operators, Force.com doesn't use an available index. Instead, it scans all records in the object to find the records that satisfy the condition. Feel free to use these operators, but be sure to add selective filter conditions.

•  The following filter operators
   » not equal to
   » contains
   » does not contain
•  When used with text fields, the following comparison operators
   » Less than (<)
   » Greater than (>)
   » Less than or equal to (<=)
   » Greater than or equal to (>=)

Additionally, Force.com doesn't use available indexes when you use:

•  Leading wildcards
•  Non-deterministic or cross-object formula fields

SOSL

Fields with Search Indexes

General
•  Name fields
•  Phone fields
•  Text fields
•  Picklist fields

These fields vary by object. See "Search Fields" in the online help.

Search Selectivity Tips

General Sidebar Search and Advanced Search
•  Be as selective as possible. For example, use Michael*, not Mich*.
•  Remember that Chatter feed searches aren't affected by the scope of your search; Chatter feed search results include matches across all objects.
•  Search for the exact phrase with an advanced search.
•  Limit scope by targeting:
   » Specific objects
   » Rows owned by the searcher
   » Rows within a division, when applicable

See "Search Overview" in the online help.

Page 55: Inside the Force.com Query Optimizer Webinar

Upcoming Events

•  April 21-27, 2013: Salesforce Mobile Developer Week
•  May 8, 2013: Summer '13 Release Developer Preview Webinar
•  May 9, 2013: SOQL Best Practices CodeTalk

Page 56: Inside the Force.com Query Optimizer Webinar

Survey

Your feedback is crucial to the success of our webinar programs.

Thank you!

http://bit.ly/querysurvey

*Look in the GoToWebinar chat window now for a hyperlink.

Page 57: Inside the Force.com Query Optimizer Webinar

John Tan

Architect Evangelist @johntansfdc

Jaikumar Bathija

Architect – DB Performance

Q&A