Shared Database Concurrency

Aivars.Kalvans@(tieto|gmail).com

Who am I

● Tieto Latvia
  – Payment Card Business Unit

● Lead software architect
  – Started as a junior developer

● Card Suite - payment card system
  – 400 institutions, 33 countries
  – 100s and 1000s of financial transactions per second

Shared memory concurrency

● Concurrent components communicate by altering the contents of shared memory locations
  – Threads (C++, Java, C#, …)
  – Processes (POSIX shared memory)

● Buzz
  – Lock-free
  – Actors
  – Functional programming

That's nice, but...

● Most applications use a single database

● Often the database is the bottleneck

● You might not know you have a problem

Shared database concurrency

● Concurrent components communicate by altering shared data in a database

Oracle Real-World Performance

● Most problems come from incorrect use of the database

● A DBA and new hardware: maybe a 2x improvement

● Code, design, algorithm: 10x or 100x improvement

Database concurrency

● Pessimistic locking
  – Concurrent updates might happen

● Optimistic locking
  – Concurrent updates will not happen

● NoSQL

“Banks don't use transactions”

● Visible tip of the iceberg

● Below the water
  – Transactions still used within the application
  – Auditing, Reconciliation, Matching
  – Code, Computing power, Manual work

● Money “reserved” for 2 or more weeks

Database concurrency

● Pessimistic locking
  – Concurrent updates might happen

● Optimistic locking
  – Concurrent updates will not happen

● NoSQL

● Anything else?

Oracle 101

● Lock table …

● Row-level (TX) locks

● Locks are held until Commit or Rollback
  – Not until Rollback to savepoint

● Writes don't block reads

● Reads don't block writes

● Writes may block writes

Placing a row-level lock

● Select … for update [nowait]
  – There is a non-blocking mode!

● Update, Delete

● Insert
  – Primary key or unique constraint violation

Let's call it “implicit locking”
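The Insert case can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in; the table `processed` and the helper `claim` are hypothetical names, not part of the talk. SQLite has no row-level locks, so the sketch only shows the constraint violation itself. In Oracle, a second session inserting the same uncommitted key would be expected to wait for the first session's Commit or Rollback before it sees the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per transaction that some worker has claimed
conn.execute("CREATE TABLE processed (txn_id INTEGER PRIMARY KEY)")


def claim(txn_id):
    """Return True if this caller inserted the key first, False otherwise."""
    try:
        conn.execute("INSERT INTO processed (txn_id) VALUES (?)", (txn_id,))
        return True
    except sqlite3.IntegrityError:
        # Primary key violation: somebody else already holds this key
        return False


first_attempt = claim(7)
second_attempt = claim(7)
```

No explicit lock statement is issued; the primary key does the work.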

Other databases

● PostgreSQL: row-level locks

● MySQL: it's complicated
  – Table-level for MyISAM, MEMORY, MERGE
  – Block-level for BDB
  – Row-level for InnoDB

● DB2: row-level or page-level

● Isolation levels

Which one to use?

● Pessimistic locking

● Optimistic locking

● Implicit locking

Difference?

Time  Pessimistic            Optimistic                      Implicit

t1    Select .. for update   Select version, ...             Select

t2    Modify data            Modify data                     Modify data

t3    Update ...             Update … set                    Update ...
                               version=:next_version
                               where version=:known_version

t4                           Retry if 0 rows updated

t5    Commit                 Commit                          Commit
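The Optimistic column can be sketched in Python as below. This is an illustration under assumptions, not the talk's code: it uses the built-in sqlite3 module, and the `emp` table, its `version` column, and the helper `update_salary_optimistic` are hypothetical names. The comments map each step to t1..t5.

```python
import sqlite3


def update_salary_optimistic(conn, emp_id, compute_new, max_retries=3):
    """Versioned update: re-read and retry when the version has moved on."""
    for _ in range(max_retries):
        # t1: read the data together with its version, without locking
        salary, version = conn.execute(
            "SELECT salary, version FROM emp WHERE id = ?", (emp_id,)
        ).fetchone()
        # t2: modify data in the application
        new_salary = compute_new(salary)
        # t3: update only if nobody changed the row since t1
        cur = conn.execute(
            "UPDATE emp SET salary = ?, version = ? "
            "WHERE id = ? AND version = ?",
            (new_salary, version + 1, emp_id, version),
        )
        if cur.rowcount == 1:
            conn.commit()  # t5
            return new_salary
        conn.rollback()  # t4: 0 rows updated, so start over
    raise RuntimeError("too many concurrent updates")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (id INTEGER PRIMARY KEY, salary INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO emp VALUES (1, 1000, 1)")
conn.commit()

new_salary = update_salary_optimistic(conn, 1, lambda s: s + 100)
```

The retry loop is the extra code that the Pessimistic and Implicit columns do not need.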

Pessimistic vs. Optimistic

“More challenging than the technology is overcoming resistance from seasoned development professionals who have been using the trusted SELECT… FOR UPDATE for all of their Oracle careers.

These individuals may need to be convinced of the benefits of using optimistic and on large development projects their support will be crucial.”

http://www.orafaq.com/papers/locking.pdf

Problems

● Lockout problem
  – Go to lunch between locking and commit

● Deadlock problem
  – No difference placing locks by select or update

● Performance
  – Select+Update vs. Select+Update vs. Select+Update

Optimistic?

● Shorter locking time
  – We don't expect concurrent updates, right?

● Wait for lock before update is executed
  – Locks are queued

Lost updates

● Absolute updates
  – Setting a new salary

● Relative updates
  – Adding a positive or a negative number
  – Min/Max
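The difference between the two kinds of update can be shown with a small sketch. It uses Python's sqlite3 module and a hypothetical `account` table; the two "clients" are simulated sequentially on one connection, which is an assumption made to keep the example deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 100)])


def read_balance(acc_id):
    return conn.execute(
        "SELECT balance FROM account WHERE id = ?", (acc_id,)
    ).fetchone()[0]


# Absolute updates: both clients read 100, then each writes its own total.
seen_by_a = read_balance(1)
seen_by_b = read_balance(1)
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (seen_by_a + 10,))
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (seen_by_b + 20,))
absolute_result = read_balance(1)  # client A's +10 has been overwritten

# Relative updates: the database applies each delta to the current value.
conn.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (10,))
conn.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (20,))
relative_result = read_balance(2)  # both deltas are kept
conn.commit()
```

With relative updates the row lock taken by each Update is enough; no version column or Select … for update is involved.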

How to choose locking

● Implicit locking by default
  – Relative updates
  – Last update wins

● Pessimistic locking
  – Prevent concurrency

● Optimistic locking
  – ?
  – Transaction isolation: Serializable

Increasing concurrency

● Reduce the time locks are held

● Reduce the need for locks

● Fine-grained locks

Warm-up

● Prepared statements

● Reduce both Hard and Soft parses
  – Create a statement just once

● Batching / bulk operations
  – Python: .executemany
  – JDBC: addBatch

● Faster CPUs and disks
  – Not more
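A minimal sketch of the Python batching bullet, assuming the built-in sqlite3 module as the DB-API driver and a hypothetical `txn` table. The statement text is created once and executed for the whole batch with .executemany.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (id INTEGER PRIMARY KEY, amount INTEGER)")

# Data for the whole batch is prepared up front
rows = [(i, i * 10) for i in range(1, 1001)]

# One statement text with bind parameters, reused for every row
INSERT_SQL = "INSERT INTO txn (id, amount) VALUES (?, ?)"
conn.executemany(INSERT_SQL, rows)
conn.commit()

row_count, total = conn.execute(
    "SELECT count(*), sum(amount) FROM txn"
).fetchone()
```

How much a driver gains from .executemany depends on the database and driver; with a networked database the usual benefit is fewer round trips.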

Warm-up

● Design
  – Avoid absolute updates

● Inserts instead of updates
  – Create a change log
  – Insert
  – Aggregate and update once in a while
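The change log idea might look like the sketch below. Table names (`account`, `account_log`) and helpers (`record_change`, `aggregate`, `current_balance`) are hypothetical, and sqlite3 is used only so the example runs anywhere. The hot path only inserts; a periodic job folds the log into the balance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE account_log (account_id INTEGER, delta INTEGER)")
conn.execute("INSERT INTO account VALUES (13, 42)")
conn.commit()


def record_change(account_id, delta):
    # Hot path: insert only, the account row itself is not touched
    conn.execute("INSERT INTO account_log VALUES (?, ?)", (account_id, delta))


def current_balance(account_id):
    # Readers combine the stored balance with changes not yet aggregated
    return conn.execute(
        "SELECT a.balance + coalesce("
        "  (SELECT sum(delta) FROM account_log WHERE account_id = a.id), 0) "
        "FROM account a WHERE a.id = ?",
        (account_id,),
    ).fetchone()[0]


def aggregate():
    # Once in a while: fold the log into the balances in one transaction
    conn.execute(
        "UPDATE account SET balance = balance + "
        "(SELECT coalesce(sum(delta), 0) FROM account_log "
        " WHERE account_id = account.id)"
    )
    conn.execute("DELETE FROM account_log")
    conn.commit()


record_change(13, 5)
record_change(13, -7)
pending_view = current_balance(13)
aggregate()
```

The trade-off is that reads become slightly more expensive until the next aggregation run.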

Skip indexes

● Select a physical row identifier for subsequent use within transaction
  – Oracle: ROWID
  – PostgreSQL: ctid

Select rowid from table1;

Update table1 set col1=:val1 where rowid=:rowid;
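The same pattern as a runnable sketch. SQLite's `rowid` is used here as a stand-in for Oracle's ROWID or PostgreSQL's ctid, which is an assumption for illustration only; the `code` column and the values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (code TEXT UNIQUE, col1 INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [("A", 1), ("B", 2)])

# The first access goes through the index on code and remembers the row id
row_id, col1 = conn.execute(
    "SELECT rowid, col1 FROM table1 WHERE code = ?", ("B",)
).fetchone()

# Later statements in the same transaction address the row directly
conn.execute("UPDATE table1 SET col1 = ? WHERE rowid = ?", (col1 + 40, row_id))
conn.commit()

updated = conn.execute(
    "SELECT col1 FROM table1 WHERE code = ?", ("B",)
).fetchone()[0]
```

The identifier is only kept for the duration of the transaction, as the slide says; physical row identifiers are generally not stable over longer periods.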

Combining SQL

Select col1, col2 from table1;

Update table1 set col3=:val3;

Update table1 set col3=:val3

returning col1, col2 into :out1, :out2;

Combining SQL

Insert into table1 (col1, col2) values (:val1, :val2);

Insert into table2 (col1, col2) values (:val1, :val2);

Insert all

into table1 (col1, col2) values (:val1, :val2)

into table2 (col1, col2) values (:val1, :val2)

Select * from dual;

Reordering

● Do (b)locking statements last


Insert into table2 (col1, col2) values (:val1, :val2);

Insert into table2 (col1, col2) values (:val1, :val2);


Combine SQL with a commit

● Do a commit with the last SQL statement
  – JDBC: setAutoCommit

Insert; Update; Commit;

Insert; Update+Commit;

No work between SQL

● Prepare data for all statements before executing the first one
  – JDBC: setInt, setString before any execute, executeUpdate

● Not an option for some APIs
  – Python: .execute(SQL, parameters)

“Partition” rows

● Multiple rows instead of one
  – Distribute evenly among processes
  – Aggregate rows when reading

Id  Balance

13  42

Id  Balance  Partition

13  15       0

13   7       1

13  11       2

13   9       3
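A sketch of the partitioned row, using the numbers from the tables above. The table name `account_balance`, the column `part`, and the helpers `add_amount` and `total_balance` are hypothetical; picking the partition by worker number modulo the partition count is one possible way to distribute updates evenly, not necessarily the one used in practice.

```python
import sqlite3

PARTITIONS = 4

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account_balance ("
    " id INTEGER, part INTEGER, balance INTEGER,"
    " PRIMARY KEY (id, part))"
)
conn.executemany(
    "INSERT INTO account_balance VALUES (13, ?, ?)",
    [(0, 15), (1, 7), (2, 11), (3, 9)],
)
conn.commit()


def add_amount(account_id, worker_no, amount):
    # Each worker updates "its own" row, so workers rarely wait on each other
    part = worker_no % PARTITIONS
    conn.execute(
        "UPDATE account_balance SET balance = balance + ? "
        "WHERE id = ? AND part = ?",
        (amount, account_id, part),
    )
    conn.commit()


def total_balance(account_id):
    # Readers aggregate all partitions of the account
    return conn.execute(
        "SELECT sum(balance) FROM account_balance WHERE id = ?", (account_id,)
    ).fetchone()[0]


initial_total = total_balance(13)
```

The four partition rows add up to the single-row balance of 42, and a relative update on any one of them changes the total by the same amount.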

Thank you