Developing a database server: software engineer's view

Developing a Database Server: Software Engineer’s View

Laurynas Biveinis / Percona

laurynas.biveinis@{gmail|percona}.com

Big Data Strategy 2015, Vilnius

Which database server?

Percona Server

http://www.percona.com/software/percona-server

A drop-in compatible fork of MySQL

An open-source, relational database management system

Approaching 2,000,000 downloads

A part of the MySQL ecosystem

Enabled by the GNU General Public License

Forks abound

Healthy and thriving

Lots of politics

The main players, pt 1

The main players, pt 3: Big Web Patches

The ecosystem is fragmented, but is it healthy?

One measure is code flow between the forks

A case of super_read_only

Facebook patch implemented it first

Facebook contributed it to WebScaleSQL

Percona Server merged it from WebScaleSQL, sent some bugfixes back to WebScaleSQL

Oracle re-implemented it from scratch for the next major MySQL release

MariaDB did not like it
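
For readers unfamiliar with the feature: as documented for MySQL 5.7, read_only rejects writes from ordinary clients but still lets accounts with the SUPER privilege write, while super_read_only rejects those as well. A toy Python model of that check follows. The function name and arguments are hypothetical, this is not server code, and exemptions such as replication threads are left out.

```python
def write_allowed(read_only: bool, super_read_only: bool, has_super: bool) -> bool:
    """Toy model of the client write check; not actual server code."""
    if super_read_only:
        # super_read_only blocks every client, SUPER or not.
        return False
    if read_only:
        # Plain read_only still lets SUPER accounts write.
        return has_super
    return True


# A SUPER account can still write under read_only, but not under super_read_only.
print(write_allowed(read_only=True, super_read_only=False, has_super=True))
print(write_allowed(read_only=True, super_read_only=True, has_super=True))
```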

Code is flowing (mostly) everywhere: coopetition

Back to Percona Server

Tracks MySQL closely

Diagnostics and management

Performance and scalability

Why diagnostics and management?

Early Percona Server:

Ad-hoc patch for extra diagnostics by Percona consultants

Get billed-per-hour work done more efficiently

Why (InnoDB) performance and scalability?

In 2010, InnoDB was performing worse on a 4-core machine than on a 1-core one

And fixes were not forthcoming at the time

Addressed the need then, built the reputation since

Why not other features?

Feature benefit / feature cost ratio has to be very, very high

Case 1: implement low-hanging fruit

Case 2: implement extremely beneficial features

No rewrites, no refactorings, no code base cleanups

“Why not other features” brings us to lessons learned

Lesson 1: stand on the shoulders of giants

You probably do not need to write a DBMS from scratch

So find a good project to fork

Lesson 2: do not diverge

Do not add a single line of code difference without a very good reason

Unless your engineering team is as big as the upstream one

Improvements such as O(n²) -> O(n log n) algorithms are often not a good enough reason in cold code paths

Plugins are very good
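
To illustrate the complexity point above with a generic example (not taken from any MySQL code path): both functions below detect duplicates, one by comparing every pair in O(n^2), the other by sorting in O(n log n). On a cold path with small inputs the speedup is hard to notice, while the difference against upstream has to be carried through every merge.

```python
def has_duplicates_quadratic(items):
    """Compare every pair of elements: O(n^2) comparisons."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_sorted(items):
    """Sort, then compare neighbours: O(n log n) overall."""
    ordered = sorted(items)
    return any(a == b for a, b in zip(ordered, ordered[1:]))
```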

Lesson 3: listen to users

Easier said than done, especially if done right

Listening and then ignoring / downplaying users’ pain

Listening to the wrong users

We have the best users! :)

$$$ / €€€ add weight to users’ opinions

Both right and wrong

Lesson 4: Continuous QC

Was not something Percona Server had on Day One

MySQL always had an automated feature/regression testsuite

But 3rd parties did not always add tests for their features

Step 1: require developers to actually run the testsuite

Step 2: Jenkins per-push

Step 3: …

Lesson 5: wrong ways and slightly less wrong ways to do performance

A Performance Graph

[Bar chart: Product A vs. Product B, vertical axis 0 to 40000]

PRODUCT B IS BETTER !!1!

Same performance graph, different view

[Line chart over time: Product A vs. Product B, vertical axis 0 to 80000, horizontal axis 00:00 to 00:06]

Is Product B still better?

How to provision capacity for B?

What response time guarantee will it give?

Will your automated failover work correctly in the presence of stalls?

Engineering low variance > engineering max peak performance
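
A small sketch of why the averaged view misleads. The per-minute throughput samples below are invented for illustration and are not the data behind the graphs. The mean favours Product B, while the worst minute and the coefficient of variation favour Product A.

```python
import statistics

# Hypothetical per-minute throughput samples, invented for illustration only.
product_a = [30000, 31000, 29500, 30500, 30000, 29000, 30000]
product_b = [70000, 68000, 2000, 71000, 1000, 69000, 3000]


def summarize(samples):
    """Return the mean, the worst sample, and the coefficient of variation."""
    mean = statistics.mean(samples)
    return {
        "mean": mean,
        "worst": min(samples),
        "cv": statistics.pstdev(samples) / mean,
    }


summary_a = summarize(product_a)
summary_b = summarize(product_b)
print(summary_a)
print(summary_b)
```

Capacity has to be provisioned for the worst minute rather than for the mean, which is why the second and third numbers matter more than the first.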

Where does variance come from anyway?

From the query code path requesting resources with variable availability

C, C++, CPU, memory: caches, heap, mutexes, rwlocks

Memory/disk: data on disk, which could be cached

RDBMS: free space in the write-ahead log (WAL), etc.

Client-server and clusters: network roundtrips

Database servers love being in homeostasis

All the required resources for queries readily available

In the presence of unpredictable load

Do not make query threads work for this

Monitor them in the background and make them available as needed

In the presence of unpredictable workload
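
A minimal sketch of the idea of keeping resources ready in the background, assuming a free list of buffers as the resource. All names and sizes here are hypothetical, and this is not how Percona Server implements it: a background thread keeps the free list topped up, so a query thread only takes a ready buffer and waits only if the list happens to be empty.

```python
import queue
import threading
import time

FREE_TARGET = 8  # hypothetical number of free buffers to keep ready

free_buffers = queue.Queue()
stop = threading.Event()


def refill_worker():
    """Background thread: keep the free list topped up to FREE_TARGET."""
    while not stop.is_set():
        while free_buffers.qsize() < FREE_TARGET:
            # Stands in for the expensive work of freeing up a resource.
            free_buffers.put(bytearray(4096))
        time.sleep(0.001)


def run_query():
    """Query thread: take a ready buffer; the refill work happens elsewhere."""
    buf = free_buffers.get(timeout=1.0)
    return len(buf)


worker = threading.Thread(target=refill_worker, daemon=True)
worker.start()
results = [run_query() for _ in range(100)]
stop.set()
worker.join()
```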

If you want to develop a DBMS:

Find an existing one to fork!

And then do not diverge

Listen to your users

Control quality continuously

Ensure stable performance
