Application Performance: Database-related Problems

www.luxoft.com APPLICATION PERFORMANCE: DATABASE-RELATED PROBLEMS Evgeniy Khyst 26.04.2016

Transcript of Application Performance: Database-related Problems

Page 1: Application Performance: Database-related Problems

www.luxoft.com

APPLICATION PERFORMANCE: DATABASE-RELATED PROBLEMS
Evgeniy Khyst
26.04.2016

Page 2: Application Performance: Database-related Problems

Application Performance: Database-related Problems

● Application performance;
● Common performance problems and their solutions;
● Database-related problems;
● Lock contention;
● Locking mechanism;
● Transaction isolation level;
● URL shortener example;
● Hi/Lo algorithms;
● Payment system example.

Page 3: Application Performance: Database-related Problems

Application Performance

● Key performance metrics:
  - Request processing time;
  - Throughput;

● Poor performance:
  - Long time to process single requests;
  - Low number of requests processed per second.

Page 4: Application Performance: Database-related Problems

Request Processing Time

Request processing time = 4 seconds

Page 5: Application Performance: Database-related Problems

Throughput

Throughput = 1/4 req/sec = 15 req/min

Page 6: Application Performance: Database-related Problems

Throughput

Throughput = 3/4 req/sec = 45 req/min

Page 7: Application Performance: Database-related Problems

Throughput

Throughput = 10/4 req/sec = 150 req/min
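The throughput figures on these slides follow from dividing the number of requests handled in parallel by the request processing time. A minimal sketch of that arithmetic, assuming the 4-second processing time stated earlier and 1, 3, and 10 requests in flight (inferred from the fractions 1/4, 3/4, and 10/4); the class and method names are illustrative only:

```java
public class ThroughputExample {
    /** Requests per minute when `parallel` requests run at once, each taking `seconds`. */
    static double requestsPerMinute(int parallel, double seconds) {
        return parallel / seconds * 60.0;
    }

    public static void main(String[] args) {
        // 1, 3 and 10 parallel requests, 4 seconds each.
        for (int parallel : new int[] {1, 3, 10}) {
            System.out.printf("%d in parallel -> %.0f req/min%n",
                    parallel, requestsPerMinute(parallel, 4.0));
        }
    }
}
```

The model ignores contention: it holds only while adding parallel requests does not make each request slower, which is exactly what lock contention breaks.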

Page 8: Application Performance: Database-related Problems

Common Performance Problems and Their Solutions

● Database-related problems;
● JVM performance problems;
● Application specific performance problems;
● Network-related problems.

Page 9: Application Performance: Database-related Problems

Database-related Performance Problems

● Query execution time is too long;
● Too many queries per single business function;
● Database connection management problems.

Page 10: Application Performance: Database-related Problems

Query Execution Time is Too Long

● Missing indexes;
● Slow SQL queries (sub-queries, too many JOINs, etc.);
● Slow SQL queries generated by ORM;
● Not optimal JDBC fetch size;
● Not parameterized statements for queries;
● Lack of proper data caching;
● Lock contention.

Page 11: Application Performance: Database-related Problems

Missing Indexes

To find out which indexes to create, look at the query execution plan.

In Oracle Database this is done as follows:

EXPLAIN PLAN FOR SELECT isbn FROM book;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());

Page 12: Application Performance: Database-related Problems

TABLE ACCESS FULL

● A full table scan reads every row of the table sequentially and checks the columns it encounters against the query condition;

● In most cases a full table scan is the slowest way to scan a table;

● Create the missing indexes so that the search uses an index instead of a full table scan.

Page 13: Application Performance: Database-related Problems

Slow SQL Queries

● Slow SQL queries (sub-queries, too many JOINs, etc.):
  Solution: rewrite the query;

● Slow SQL queries generated by ORM:
  - JPQL/HQL and Criteria API queries are translated to SQL;
  Solutions:
  - Rewrite the JPQL/HQL or Criteria API queries;
  - Replace with a plain SQL query.

Page 14: Application Performance: Database-related Problems

Not Optimal JDBC Fetch Size

JDBC allows you to specify the number of rows fetched with each database round-trip for a query; this number is referred to as the fetch size.

Solutions:
● java.sql.Statement.setFetchSize(rows)
● hibernate.jdbc.fetch_size property
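To see why the fetch size matters, it helps to count round-trips. The sketch below is a rough model only: it assumes the driver fetches exactly `fetchSize` rows per round-trip and ignores any extra trip a real driver may make to detect the end of the result set. The row counts and fetch sizes are hypothetical.

```java
public class FetchSizeExample {
    /** Approximate number of round-trips needed to read `rows` rows. */
    static long roundTrips(long rows, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        return (rows + fetchSize - 1) / fetchSize; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 10_000; // hypothetical result set size
        System.out.println("fetch size 10:  " + roundTrips(rows, 10) + " round-trips");
        System.out.println("fetch size 500: " + roundTrips(rows, 500) + " round-trips");
    }
}
```

A larger fetch size trades fewer round-trips for more memory per fetch, so the right value depends on row width and network latency.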

Page 15: Application Performance: Database-related Problems

Not Parameterized Statements for Queries

When a database receives an SQL statement, it:
● parses the statement and looks for syntax errors;
● generates the access plan (checks which indexes can be used, etc.);
● executes the statement.

Problem: Access plan generation takes CPU power.

Page 16: Application Performance: Database-related Problems

Not Parameterized Statements for Queries

● The database caches the computed access plan;
● JEE application servers cache PreparedStatement instances.

Solution: Reusing the previous access plan or PreparedStatement saves CPU power.

Page 17: Application Performance: Database-related Problems

Not Parameterized Statements for Queries

The entire statement is the key in the cache.

SELECT * FROM tbl WHERE name='a'

and

SELECT * FROM tbl WHERE name='b'

will have different entries in the caches because name='b' is different from the cached name='a'.

Page 18: Application Performance: Database-related Problems

Not Parameterized Statements for Queries

for (String name : names) {
    Statement stmt = conn.createStatement();
    // The value is concatenated into the SQL text, so every name yields a different statement.
    ResultSet rs = stmt.executeQuery("SELECT * FROM tbl WHERE name = '" + name + "'");
    /* … */
}

The cache won't be used; a new access plan is computed on each iteration.

PreparedStatement ps = conn.prepareStatement("SELECT * FROM tbl WHERE name = ?");
for (String name : names) {
    ps.setString(1, name);
    ResultSet rs = ps.executeQuery();
    /* … */
}

The database reuses the access plan for the statement parameterized with '?'.
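The effect of keying the cache on the entire statement text can be illustrated without a database. The sketch below is a simulation, not how any particular database implements its plan cache: it counts how many distinct SQL texts, and therefore how many cache entries, each style of query construction produces. All names are illustrative.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class PlanCacheKeyExample {
    /** Number of entries a cache keyed by the full SQL text would hold. */
    static int distinctStatements(List<String> sqlTexts) {
        Set<String> keys = new HashSet<>(sqlTexts);
        return keys.size();
    }

    /** One SQL text per name, with the value concatenated into the statement. */
    static List<String> concatenated(List<String> names) {
        return names.stream()
                .map(n -> "SELECT * FROM tbl WHERE name = '" + n + "'")
                .collect(Collectors.toList());
    }

    /** The same SQL text for every name; the value travels as a bind parameter. */
    static List<String> parameterized(List<String> names) {
        return names.stream()
                .map(n -> "SELECT * FROM tbl WHERE name = ?")
                .collect(Collectors.toList());
    }
}
```

With three names, the concatenated form yields three cache keys while the parameterized form yields one, which is why only the latter can reuse a cached plan.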

Page 19: Application Performance: Database-related Problems

Lack of Proper Data Caching

Solutions:
● Enable ORM second-level cache;
● Enable ORM query cache;
● Implement custom cache.

Page 20: Application Performance: Database-related Problems

Lock Contention

Operations wait a long time to obtain a lock because of high lock contention.

Solution: Revise application logic and implementation:
● Update asynchronously;
● Replace updates with inserts (inserts are not blocking).

Page 21: Application Performance: Database-related Problems

Too Many Queries per Single Business Function

● Insert/update queries executed in a loop;
● "SELECT N+1" problem;
● Reduce the number of calls hitting the database.

Page 22: Application Performance: Database-related Problems

Insert/Update Queries Executed in a Loop

● Use JDBC batch (keep batch size less than 1000);
● hibernate.jdbc.batch_size property;
● Periodically flush changes and clear Session/EntityManager to control first-level cache size.

Page 23: Application Performance: Database-related Problems

JDBC Batch Processing

PreparedStatement preparedStatement = connection.prepareStatement("UPDATE book SET title=? WHERE isbn=?");

// First update in the batch
preparedStatement.setString(1, "Patterns of Enterprise Application Architecture");
preparedStatement.setString(2, "007-6092019909");
preparedStatement.addBatch();

// Second update in the batch
preparedStatement.setString(1, "Enterprise Integration Patterns");
preparedStatement.setString(2, "978-0321200686");
preparedStatement.addBatch();

// Send all queued updates to the database in one round-trip
int[] affectedRecords = preparedStatement.executeBatch();

for (int i = 0; i < 100000; i++) {
    Book book = new Book(.....);
    session.save(book);
    if (i % 20 == 0) { // 20, same as the JDBC batch size
        // flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}

Page 24: Application Performance: Database-related Problems

"SELECT N+1" Problem

● The first query selects root entities only, and each associated collection is then selected with an additional query.

● So the persistence provider generates N+1 SQL queries, where N is the number of root entities in the result list of the user query.
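The query count can be made concrete with a small in-memory simulation. This is a sketch, not ORM code: the author/book data and the method names are hypothetical stand-ins, and each method call that would hit the database simply increments a counter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SelectNPlusOneExample {
    // Hypothetical data: author id -> titles of that author's books.
    static final Map<Integer, List<String>> BOOKS = new LinkedHashMap<>();
    static {
        BOOKS.put(1, Arrays.asList("A1", "A2"));
        BOOKS.put(2, Arrays.asList("B1"));
        BOOKS.put(3, Arrays.asList("C1", "C2", "C3"));
    }

    static int queries = 0; // counts simulated database round-trips

    static List<Integer> findAuthorIds() { queries++; return new ArrayList<>(BOOKS.keySet()); }

    static List<String> findBooks(int authorId) { queries++; return BOOKS.get(authorId); }

    static Map<Integer, List<String>> findAllBooksGrouped() { queries++; return BOOKS; }

    /** Lazy loading: one query for the roots, then one more per root. */
    static int lazyLoad() {
        queries = 0;
        for (int id : findAuthorIds()) {
            findBooks(id);
        }
        return queries;
    }

    /** Loading all collections in one extra query instead of one per root. */
    static int bulkLoad() {
        queries = 0;
        findAuthorIds();
        findAllBooksGrouped();
        return queries;
    }
}
```

With three root entities, `lazyLoad()` issues 3 + 1 = 4 simulated queries, while `bulkLoad()` issues 2 regardless of how many roots there are.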

Page 25: Application Performance: Database-related Problems

"SELECT N+1" Problem

Solutions:
● Use a different fetching strategy or an entity graph;
● Make child entities aggregate roots and use DAO methods to fetch them:
  - Replace bidirectional one-to-many mapping with unidirectional;
● Enable second-level and query cache.

Page 26: Application Performance: Database-related Problems

Reduce the Number of Database Calls

Solutions:
● Use Hi/Lo algorithms;
● Enable ORM second-level cache;
● Enable ORM query cache;
● Implement custom cache.

Page 27: Application Performance: Database-related Problems

Database Connection Management Problems

● Application is using too many DB connections:
  - Application is not closing connections after use.
    Solution: Close all connections after use;
  - DB is not able to handle as many connections as the application uses.
    Solution: Use connection pooling;

● Application waits too long to get a connection from the pool.
  Solution: Increase pool size.

Page 28: Application Performance: Database-related Problems

JVM Performance Problems

Excessive JVM garbage collection slows down the application.

Solutions:
● Analyze garbage collector logs:
  - Send GC data to a log file and enable GC log rotation:
    -Xloggc:gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=1M -XX:+PrintGCTimeStamps

● Tune GC:
  - Use the Garbage-First collector: -XX:+UseG1GC

Page 29: Application Performance: Database-related Problems

Application Specific Performance Problems

Resource-consuming computations:
● Algorithms with complexity O(N^2), O(2^N);
● Asymmetric RSA encryption;
● Bcrypt hashing during authentication;
● Etc.

Solution: Horizontal scalability. Increase the number of instances capable of processing requests and balance the load (create a cluster).

Page 30: Application Performance: Database-related Problems

Network-related Problems

● Network latency;
● Timeouts not configured:
  - mail.smtp.connectiontimeout: socket connection timeout. Default is infinite timeout.
  - mail.smtp.timeout: socket read timeout. Default is infinite timeout.

Page 31: Application Performance: Database-related Problems

Reducing Lock Contention

● Database-related problems
  - Query execution time is too long
    • Lock contention

Solutions:
● Use Hi/Lo algorithms;
● Update asynchronously;
● Replace updates with inserts.

Page 32: Application Performance: Database-related Problems

Locking Mechanism

Locks are mechanisms that prevent destructive interaction between transactions accessing the same resource.

In general, multi-user databases use some form of data locking to solve the problems associated with:
● data concurrency,
● consistency,
● integrity.

Page 33: Application Performance: Database-related Problems

Isolation Levels vs Locks

● Transaction isolation level does not affect the locks that are acquired to protect data modifications.

● A transaction always gets an exclusive lock on any data it modifies and holds that lock until the transaction completes, regardless of the isolation level set for that transaction.

● For read operations transaction isolation levels primarily define the level of protection from the effects of modifications made by other transactions.

Page 34: Application Performance: Database-related Problems

Preventable Read Phenomena

● Dirty reads - A transaction reads data that has been written by another transaction that has not been committed yet.

● Nonrepeatable reads - A transaction rereads data it has previously read and finds that another committed transaction has modified or deleted the data.

● Phantom reads - A transaction reruns a query returning a set of rows that satisfies a search condition and finds that another committed transaction has inserted additional rows that satisfy the condition.

Page 35: Application Performance: Database-related Problems

Standard Transaction Isolation Levels

● Read uncommitted
● Read committed
● Repeatable read
● Serializable

Page 36: Application Performance: Database-related Problems

Isolation Levels vs Read Phenomena

                   Dirty reads    Nonrepeatable reads   Phantom reads
Read uncommitted   Possible       Possible              Possible
Read committed     Not possible   Possible              Possible
Repeatable read    Not possible   Not possible          Possible
Serializable       Not possible   Not possible          Not possible

Page 37: Application Performance: Database-related Problems

Default Isolation Level

Read committed is the default isolation level in many databases, for example Oracle, PostgreSQL, and SQL Server.

Page 38: Application Performance: Database-related Problems

Read Committed Isolation Level

In read committed, reads are not blocking.

Page 39: Application Performance: Database-related Problems

Read Committed Isolation Level

Conflicting writes in read committed transactions.

Page 40: Application Performance: Database-related Problems

Pessimistic and Optimistic Locking

Pessimistic and optimistic locking are concurrency control mechanisms.

Pessimistic locking is a strategy in which you lock a record when reading it and then modify it:

SELECT name FROM tbl FOR UPDATE;
UPDATE tbl SET name = 'new value';

Optimistic locking is a strategy in which you read a record together with its version number and then check this version when updating:

SELECT name, version FROM tbl;

UPDATE tbl SET name = 'new value', version = version + 1 WHERE version = :version;

Page 41: Application Performance: Database-related Problems

Pessimistic and Optimistic Locking

● Pessimistic locking prevents lost updates and makes updates serial (FIFO), reducing throughput;
● Optimistic locking just prevents lost updates;
● If the version check in optimistic locking fails, the read and update queries should be re-executed;
● Optimistic locking reduces the time the lock is held and sometimes increases throughput.
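The re-execution of read and update queries after a failed version check can be sketched as a retry loop. The example below is an in-memory stand-in, not database or ORM code: the `Row` class imitates a single table row with a version column, and `maxAttempts` is an assumed parameter that the source does not specify.

```java
public class OptimisticLockingExample {
    /** In-memory stand-in for one table row with a version column. */
    static class Row {
        private String name = "initial";
        private long version = 0;

        synchronized long readVersion() { return version; }

        synchronized String readName() { return name; }

        /** Mimics UPDATE ... SET version = version + 1 WHERE version = :version. */
        synchronized boolean update(String newName, long expectedVersion) {
            if (version != expectedVersion) {
                return false; // zero rows updated: another writer changed the row
            }
            name = newName;
            version++;
            return true;
        }
    }

    /** Re-reads and retries until the versioned update succeeds or attempts run out. */
    static boolean updateWithRetry(Row row, String newName, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long version = row.readVersion();   // SELECT name, version
            if (row.update(newName, version)) { // UPDATE ... WHERE version = :version
                return true;
            }
        }
        return false;
    }
}
```

An update made with a stale version returns `false` instead of silently overwriting the newer value, which is how the lost update is prevented.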

Page 42: Application Performance: Database-related Problems

URL Shortener Example

Requirements:
● Receives a URL and returns a "shortened" version;
● E.g. post "http://github.com" to "http://url-shortener/s/" and get back "http://url-shortener/s/2Bi";
● The shortened URL can be resolved to the original URL. E.g. "http://url-shortener/s/2Bi" will return "http://github.com";
● Shortened URLs that have not been accessed for longer than some specified amount of time should be deleted.

Page 43: Application Performance: Database-related Problems

URL Shortener Example

● Each time a URL is submitted, a new record is inserted into the database;
● Insert operations do not introduce locks in the database;
● A database sequence is used for primary key generation;
● The Hi/Lo algorithm reduces the number of database hits to improve performance.

Page 44: Application Performance: Database-related Problems

URL Shortener Example

● The original URL's primary key is converted to radix 62:
  - The radix 62 alphabet contains digits and lower- and upper-case letters;
  - 10000 in radix 10 = 2Bi in radix 62;
● The string identifying the original URL is converted back to radix 10 to get the primary key value, and the original URL can then be found by ID.
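A minimal sketch of the radix 62 conversion in both directions. The alphabet ordering (digits, then lower-case, then upper-case letters) is an assumption chosen because it reproduces the 10000 = 2Bi example from the slide; the class and method names are illustrative.

```java
public class Base62 {
    // Assumed ordering: digits, lower-case letters, upper-case letters.
    private static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /** Converts a non-negative primary key to its radix 62 string. */
    static String encode(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        if (id == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % 62))); // least significant digit first
            id /= 62;
        }
        return sb.reverse().toString();
    }

    /** Converts a radix 62 string back to the primary key. */
    static long decode(String s) {
        long id = 0;
        for (char c : s.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + c);
            }
            id = id * 62 + digit;
        }
        return id;
    }
}
```

Because the mapping is a pure function of the id, resolving a short URL needs no lookup table beyond the primary key index.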

Page 45: Application Performance: Database-related Problems

URL Shortener Example

E.g. URL "http://github.com/" is shortened to "http://url-shortener/s/2Bi":
● A new record with id 10000 is inserted into the database for the original URL "http://github.com/"; it represents the "shortened" URL;
● Id 10000 is converted to radix 62: 2Bi.

Page 46: Application Performance: Database-related Problems

URL Shortener Example

● Each time a shortened URL is resolved, the last view timestamp is updated in the database and the total number of views column is incremented;

● These updates should be asynchronous so that lock contention does not reduce performance;

● The absence of update operations gives the application better scalability and throughput.

Page 47: Application Performance: Database-related Problems

Update Asynchronously

● When a URL is resolved, a JMS message is sent to a queue;
● The application consumes messages from the queue and updates records in the database;
● During URL resolving there are no update operations.
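The shape of this flow can be sketched with plain JDK classes. This is not JMS code: a `BlockingQueue` stands in for the JMS queue and a map stands in for the view-count column, both purely as assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncViewCounterExample {
    // Stand-in for the JMS queue: holds ids of resolved URLs.
    static final BlockingQueue<Long> VIEW_EVENTS = new LinkedBlockingQueue<>();
    // Stand-in for the view-count column in the database.
    static final Map<Long, Integer> VIEW_COUNTS = new ConcurrentHashMap<>();

    /** Request path: only enqueues an event and performs no update. */
    static void onResolved(long urlId) {
        VIEW_EVENTS.add(urlId);
    }

    /** Background consumer: drains pending events and applies the updates. */
    static int drainAndApply() {
        int applied = 0;
        Long urlId;
        while ((urlId = VIEW_EVENTS.poll()) != null) {
            VIEW_COUNTS.merge(urlId, 1, Integer::sum);
            applied++;
        }
        return applied;
    }
}
```

The request path never touches the counters, so concurrent resolutions of the same URL do not wait on each other; the counts become visible only after the consumer runs.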

Page 48: Application Performance: Database-related Problems

Hi/Lo Algorithms

Using a Hi/Lo algorithm lets different application nodes generate identifiers without blocking each other.

Page 49: Application Performance: Database-related Problems

Hi/Lo Algorithms

JPA mapping:

@SequenceGenerator(name = "MY_SEQ", sequenceName = "MY_SEQ", allocationSize = 50)

allocationSize = N: fetch the next value from the database once in every N persist calls and increment the value locally (in memory) in between.

Sequence DDL:

CREATE SEQUENCE MY_SEQ INCREMENT BY 50 START WITH 50;

INCREMENT BY should match allocationSize.
START WITH should be greater than or equal to allocationSize.
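The idea of fetching one sequence value per N identifiers can be sketched without a database. The code below is a simplified illustration, not the persistence provider's actual generator: `FakeSequence` is a hypothetical stand-in for the database sequence, and treating each sequence value as the top of a block of N ids is one possible interpretation of the mapping above.

```java
import java.util.concurrent.atomic.AtomicLong;

public class HiLoExample {
    /** Stand-in for a database sequence with INCREMENT BY n START WITH n. */
    static class FakeSequence {
        private final int increment;
        private final AtomicLong next;
        private final AtomicLong calls = new AtomicLong();

        FakeSequence(int increment) {
            this.increment = increment;
            this.next = new AtomicLong(increment);
        }

        long nextVal() {
            calls.incrementAndGet(); // each call models one database hit
            return next.getAndAdd(increment);
        }

        long calls() { return calls.get(); }
    }

    /** Hands out ids from a locally held block; hits the sequence once per block. */
    static class IdGenerator {
        private final FakeSequence sequence;
        private final int allocationSize;
        private long nextId = 0;
        private long lastIdInBlock = -1; // forces a fetch on first use

        IdGenerator(FakeSequence sequence, int allocationSize) {
            this.sequence = sequence;
            this.allocationSize = allocationSize;
        }

        synchronized long nextId() {
            if (nextId > lastIdInBlock) {
                lastIdInBlock = sequence.nextVal();          // top of the new block
                nextId = lastIdInBlock - allocationSize + 1; // bottom of the new block
            }
            return nextId++;
        }
    }
}
```

Two generators sharing one sequence receive disjoint blocks (1 to 50 and 51 to 100 here), so each node can assign ids locally and only coordinates through the database once per block.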

Page 50: Application Performance: Database-related Problems

Payment System Example

Requirements:
● Users can add funds to their accounts (add funds);
● Users can pay shops with funds from their accounts (payment);
● Users and shops can withdraw money from their accounts (withdraw funds);
● The account balance must always be up to date.

Page 51: Application Performance: Database-related Problems

Simple Solution 1

● Store account balance in table and update on each operation.

● Advantage: Simple

Page 52: Application Performance: Database-related Problems

Simple Solution 1 - Data Model

Table ACCOUNT_BALANCE

ACCOUNT_ID BALANCE

Page 53: Application Performance: Database-related Problems

Simple Solution 1 - Queries

UPDATE ACCOUNT_BALANCE SET BALANCE = BALANCE + :amount WHERE ACCOUNT_ID = :account

SELECT ACCOUNT_ID, BALANCE FROM ACCOUNT_BALANCE WHERE ACCOUNT_ID = :account

Page 54: Application Performance: Database-related Problems

Simple Solution 1 - Problems

● Update operations introduce locks;
● During Christmas holidays users can make hundreds of payments simultaneously;
● Due to lock contention payments will be slow;
● The system has low throughput.

Page 55: Application Performance: Database-related Problems

Simple Solution 2

● Do not store account balance at all;
● Store details of each transaction;
● Calculate balance dynamically based on transaction log;
● Advantages:
  - Still simple enough;
  - No update operations at all.

Page 56: Application Performance: Database-related Problems

Simple Solution 2 - Data Model

Table TRANSACTION_LOG

TX_ID TX_TYPE TX_STATUS TX_DATE ACCOUNT_ID TX_AMOUNT

Page 57: Application Performance: Database-related Problems

Simple Solution 2 - Queries

● Payment and withdrawal are 2-step operations:
  - Authorization step;
  - Fulfillment step;
● First, the authorization step is done in a separate transaction;
● Next, the balance check and fulfillment step are done in another transaction.

Page 58: Application Performance: Database-related Problems

Simple Solution 2 - Queries

-- Authorization in new transaction
INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_STATUS, TX_DATE, ACCOUNT_ID, TX_AMOUNT)
VALUES(:id, :type, 'AUTHORIZED', :date, :account, :amount)

-- Balance check and fulfillment in new transaction
SELECT ACCOUNT_ID, SUM(TX_AMOUNT) AS BALANCE
FROM TRANSACTION_LOG
WHERE ACCOUNT_ID = :account
GROUP BY ACCOUNT_ID

UPDATE TRANSACTION_LOG SET TX_STATUS = 'FULFILLED' WHERE TX_ID = :id

Page 59: Application Performance: Database-related Problems

Simple Solution 2 - Problems

● Users can make thousands of transactions per day;
● During Christmas holidays users can make thousands of payments per hour;
● The number of transactions grows continuously;
● More records in the TRANSACTION_LOG table mean slower requests.

Page 60: Application Performance: Database-related Problems

Better Solution

● Store yesterday's balance in a table;
● Update the account balance once a day in the background;
● Store details of each transaction;
● Calculate the balance dynamically from yesterday's balance and the transactions made today, taken from the transaction log.

Page 61: Application Performance: Database-related Problems

Better Solution - Data Model

Table ACCOUNT_BALANCE

ACCOUNT_ID BALANCE_DATE BALANCE

Table TRANSACTION_LOG

TX_ID TX_TYPE TX_STATUS TX_DATE ACCOUNT_ID TX_AMOUNT

Page 62: Application Performance: Database-related Problems

Better Solution - Queries

-- Authorization in new transaction
INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_STATUS, TX_DATE, ACCOUNT_ID, TX_AMOUNT)
VALUES(:id, :type, 'AUTHORIZED', :date, :account, :amount)

-- Balance check and fulfillment in new transaction
UPDATE TRANSACTION_LOG SET TX_STATUS = 'FULFILLED' WHERE TX_ID = :id

-- Executed once a day at midnight
UPDATE ACCOUNT_BALANCE
SET BALANCE = BALANCE + :transactionLogSum, BALANCE_DATE = :lastTransactionLogDate
WHERE ACCOUNT_ID = :account

Page 63: Application Performance: Database-related Problems

Better Solution - Queries

SELECT ACCOUNT_ID, BALANCE_DATE, BALANCE AS CACHED_BALANCE
FROM ACCOUNT_BALANCE
WHERE ACCOUNT_ID = :account

SELECT ACCOUNT_ID, MAX(TX_DATE) AS LAST_TX_LOG_DATE, SUM(TX_AMOUNT) AS TX_LOG_SUM
FROM TRANSACTION_LOG
WHERE ACCOUNT_ID = :account AND TX_DATE > :balanceDate
GROUP BY ACCOUNT_ID

-- BALANCE = CACHED_BALANCE + TX_LOG_SUM
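The balance calculation performed by these two queries can be sketched in memory. The example assumes that TX_AMOUNT is signed, with payments and withdrawals stored as negative amounts, which the SUM in the source implies but does not state; the `Tx` class and the sample values are hypothetical.

```java
import java.math.BigDecimal;
import java.time.LocalDateTime;
import java.util.List;

public class BalanceExample {
    /** Minimal stand-in for a TRANSACTION_LOG row (TX_DATE, TX_AMOUNT). */
    static class Tx {
        final LocalDateTime date;
        final BigDecimal amount; // assumed signed: negative for payments and withdrawals

        Tx(LocalDateTime date, String amount) {
            this.date = date;
            this.amount = new BigDecimal(amount);
        }
    }

    /** BALANCE = CACHED_BALANCE + sum of transactions with TX_DATE > BALANCE_DATE. */
    static BigDecimal currentBalance(BigDecimal cachedBalance,
                                     LocalDateTime balanceDate,
                                     List<Tx> log) {
        BigDecimal txLogSum = BigDecimal.ZERO;
        for (Tx tx : log) {
            if (tx.date.isAfter(balanceDate)) { // only rows newer than the cached balance
                txLogSum = txLogSum.add(tx.amount);
            }
        }
        return cachedBalance.add(txLogSum);
    }
}
```

Transactions dated on or before BALANCE_DATE are skipped because the nightly job has already folded them into the cached balance, so the SUM only ever covers rows since the last run.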

Page 64: Application Performance: Database-related Problems

Better Solution - Advantages

● No updates during payment operations - no locks;
● No locks - better throughput;
● Number of rows in the query with the SUM operation is limited (1 day);
● Constant query execution time.

Page 65: Application Performance: Database-related Problems

THANK YOU