Developing a database server: software engineer's view

Developing a Database Server: Software Engineer’s View Laurynas Biveinis / Percona laurynas.biveinis@{gmail|percona}.com Big Data Strategy 2015 Vilnius

Transcript of Developing a database server: software engineer's view

Page 1: Developing a database server: software engineer's view

Developing a Database Server: Software Engineer’s View

Laurynas Biveinis / Percona

laurynas.biveinis@{gmail|percona}.com

Big Data Strategy 2015, Vilnius

Page 2: Developing a database server: software engineer's view

Which database server?

Percona Server

http://www.percona.com/software/percona-server

A drop-in compatible fork of MySQL

An open-source, relational database management system

Approaching 2,000,000 downloads

Page 3: Developing a database server: software engineer's view

A part of the MySQL ecosystem

Enabled by the GNU General Public License

Forks abound

Healthy and thriving

Lots of politics

Page 4: Developing a database server: software engineer's view

The main players, pt 1

Page 5: Developing a database server: software engineer's view

The main players, pt 2

Page 6: Developing a database server: software engineer's view

The main players, pt 3

Big Web Patches

Page 7: Developing a database server: software engineer's view

The main players, pt 4

Page 8: Developing a database server: software engineer's view

The main players, pt 5

Page 9: Developing a database server: software engineer's view

The ecosystem is fragmented, but is it healthy?

One measure is code flow between the forks

Page 10: Developing a database server: software engineer's view

A case of super_read_only

Page 14: Developing a database server: software engineer's view

A case of super_read_only

Facebook patch implemented it first

Facebook contributed it to WebScaleSQL

Percona Server merged it from WebScaleSQL, sent some bugfixes back to WebScaleSQL

Oracle re-implemented it from scratch for the next major MySQL release

MariaDB did not like it

Page 15: Developing a database server: software engineer's view

Code is flowing (mostly) everywhere

Coopetition

Page 16: Developing a database server: software engineer's view

Back to Percona Server

Tracks MySQL closely

Diagnostics and management

Performance and scalability

Page 17: Developing a database server: software engineer's view

Why diagnostics and management?

Early Percona Server:

Ad-hoc patch for extra diagnostics by Percona consultants

Get billed-per-hour work done more efficiently

Page 18: Developing a database server: software engineer's view

Why (InnoDB) performance and scalability?

In 2010, InnoDB was performing worse on a 4-core machine than on a 1-core one

And fixes were not forthcoming at the time

Addressed the need then, built the reputation since

Page 19: Developing a database server: software engineer's view

Why not other features?

Feature benefit / feature cost ratio has to be very, very high

Case 1: implement low-hanging fruit

Case 2: implement extremely beneficial features

No rewrites, no refactorings, no code base cleanups

Page 20: Developing a database server: software engineer's view

“Why not other features” brings us to lessons learned

Page 21: Developing a database server: software engineer's view

Lesson 1: stand on the shoulders of giants

You probably do not need to write a DBMS from scratch

So find a good project to fork

Page 22: Developing a database server: software engineer's view

Lesson 2: do not diverge

Do not add a single line of code difference without a very good reason

Unless your engineering team is as big as the upstream one

Improvements such as O(n²) -> O(n log n) algorithms are often not good enough in cold code paths

Plugins are very good

Page 23: Developing a database server: software engineer's view

Lesson 3: listen to users

Easier said than done, especially if done right

Listening and then ignoring / downplaying users’ pain

Listening to the wrong users

We have the best users! :)

$$$ / €€€ add weight to users’ opinions

Both right and wrong

Page 24: Developing a database server: software engineer's view

Lesson 4: Continuous QC

Was not something Percona Server had on Day One

MySQL always had an automated feature/regression testsuite

But third parties did not always add tests for their features

Step 1: require developers to actually run the testsuite

Step 2: Jenkins per-push

Step 3: …

Page 25: Developing a database server: software engineer's view

Lesson 5: wrong ways and slightly less wrong ways to do performance

Page 26: Developing a database server: software engineer's view

A Performance Graph

[Chart: Product A vs. Product B; y-axis from 0 to 40000]

Page 27: Developing a database server: software engineer's view

A Performance Graph

[Chart: Product A vs. Product B; y-axis from 0 to 40000]

PRODUCT B IS BETTER !!1!

Page 28: Developing a database server: software engineer's view

Same performance graph, different view

[Chart: Product A vs. Product B over time; x-axis from 00:00 to 00:06, y-axis from 0 to 80000]

Page 29: Developing a database server: software engineer's view

Is Product B still better?

How to provision capacity for B?

What response time guarantee will it give?

Will your automated failover work correctly in the presence of stalls?

[Chart: the same view over time; x-axis from 00:00 to 00:06, y-axis from 0 to 80000]

Page 30: Developing a database server: software engineer's view

Engineering low variance > engineering max peak performance

Where does variance come from anyway?

From the query code path requesting resources with variable availability

C, C++, CPU, memory: caches, heap, mutexes, rwlocks

Memory/disk: data on disk, which could be cached

RDBMS: free space in the write-ahead log (WAL), etc.

Client-server and clusters: network roundtrips
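The point about low variance versus peak performance can be illustrated with a small sketch. The throughput numbers below are invented for illustration only (they are not the data behind the slides' charts), and the `summarize` helper and the stall threshold are hypothetical. The sketch shows how a product can win on average throughput while still stalling regularly.

```python
from statistics import mean, pstdev


def summarize(samples, stall_threshold):
    """Summarize a throughput time series (hypothetical helper).

    Returns the mean, the coefficient of variation (stddev / mean),
    the worst sample, and how many samples fell below stall_threshold.
    """
    avg = mean(samples)
    return {
        "mean": avg,
        "cv": pstdev(samples) / avg,
        "min": min(samples),
        "stalls": sum(1 for s in samples if s < stall_threshold),
    }


# Invented per-interval throughput samples, NOT taken from the talk.
product_a = [30000, 31000, 29000, 30000, 30500, 29500, 30000]
product_b = [70000, 5000, 72000, 3000, 71000, 4000, 69000]

a = summarize(product_a, stall_threshold=10000)
b = summarize(product_b, stall_threshold=10000)

for name, stats in (("Product A", a), ("Product B", b)):
    print(name, stats)
```

With these made-up samples, Product B has the higher mean, but it also has a far larger coefficient of variation and several intervals below the stall threshold, which is what makes capacity planning and response time guarantees hard.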

Page 31: Developing a database server: software engineer's view

Database servers love being in homeostasis

All the required resources for queries readily available

In the presence of unpredictable load

Do not make query threads work for this

Monitor them in the background and make them available as needed

In the presence of unpredictable workload
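The idea of keeping resources ready in the background, rather than making query threads prepare them, could look roughly like the sketch below. This is a toy model and not Percona Server code: the `FreeList` class, its `low_water` parameter, and the integer "resources" are all hypothetical stand-ins for something like a list of free buffer pages.

```python
import queue
import threading


class FreeList:
    """Hypothetical pool of ready-to-use resources, refilled in the background."""

    def __init__(self, capacity, low_water):
        self._items = queue.Queue(maxsize=capacity)
        self._low_water = low_water
        self._need_refill = threading.Event()
        self._stop = threading.Event()
        self._next_id = 0
        self._worker = threading.Thread(target=self._refill_loop, daemon=True)

    def start(self):
        self._refill()  # fill once before serving any request
        self._worker.start()

    def stop(self):
        self._stop.set()
        self._need_refill.set()  # wake the worker so it can exit
        self._worker.join()

    def _refill(self):
        # Stand-in for the expensive work (flushing, allocating, ...).
        # Only one thread at a time ever runs this, so full() is reliable.
        while not self._items.full():
            self._items.put_nowait(self._next_id)
            self._next_id += 1

    def _refill_loop(self):
        while not self._stop.is_set():
            self._need_refill.wait()
            self._need_refill.clear()
            if self._stop.is_set():
                break
            self._refill()

    def acquire(self, timeout=1.0):
        """Called by a 'query thread': take a ready resource, never build one."""
        item = self._items.get(timeout=timeout)
        if self._items.qsize() <= self._low_water:
            self._need_refill.set()  # cheap signal; the worker does the work
        return item


pool = FreeList(capacity=8, low_water=2)
pool.start()
got = [pool.acquire() for _ in range(50)]
pool.stop()
print(len(got), "resources acquired")
```

The design choice illustrated here is that the caller only signals when the pool drops to the low-water mark, so under normal load it finds a resource already waiting. It blocks only if demand outruns the background worker.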

Page 32: Developing a database server: software engineer's view

If you want to develop a DBMS:

Find an existing one to fork!

And then do not diverge

Listen to your users

Control quality continuously

Ensure stable performance