TQ
dev.com

Blog about software development

Subscribe

Trading durability for performance without NoSQL

07 May 2016 - by 'Maurits van der Schee'

Developers turn to NoSQL solutions whenever they are confronted with a DBMS (write) performance challenge. I think that in most of the cases that is an exceptionally bad idea. By choosing NoSQL you trade the A, C and I from ACID for performance:

  1. Atomicity (risk of writes half succeeding, due to lack of multi-entity transactions)
  2. Consistency (risk of ambiguous data due to denormalization into multiple key/value sets)
  3. Consistency (risk of "messy" data, due to lack of schema and constraints)
  4. Isolation (risk of parallel writes on inconsistent state, due to lack of multi-entity transactions)
  5. Durability (risk of losing data by not flushing to disk, but keeping in memory)

A, C and I are actually the things of ACID that I often don't want to trade. The thing I am most willing to trade is the "D". I don't care that in the very unlikely event of an application or server crash a few seconds of writes are lost in exchange for a ten to hundreds times better performance. Or as they say at MongoDB:

... we think single server durability is overvalued. First, there are many scenarios in which that server loses all its data no matter what. If there is water damage, fire, some hardware problems, etc... no matter how durable the software is, data can be lost. (source)

Fortunately, all big relational databases allow this trade-off to be configured. This post will explain what you need to do in order to configure your database for higher performance, trading durability instead of consistency (as NoSQL does).

MariaDB (previously: MySQL)

The following settings will make your MySQL server perform a lot better (on writes), risking 1 second of data-loss:

innodb_flush_log_at_trx_commit = 0
innodb_flush_log_at_timeout = 1

Or as the MySQL documentation says:

Controls the balance between strict ACID compliance for commit operations, and higher performance that is possible when commit-related I/O operations are rearranged and done in batches. You can achieve better performance by changing the default value, but then you can lose up to a second of transactions in a crash. (source).

Note that the one second timeout is only configurable from version 5.6.6.

PostgreSQL: Non-Durable Settings

PostgreSQL has a similar feature:

synchronous_commit off

The documentation says:

The default, and safe, setting is on. When off, there can be a delay between when success is reported to the client and when the transaction is really guaranteed to be safe against a server crash. (The maximum delay is three times wal_writer_delay.) (source)

It also recommends this feature over other optimizations:

In many scenarios, asynchronous commit provides most of the performance improvement that could be obtained by turning off fsync, but without the risk of data corruption. (source)

It seems this feature is available in all supported PostgreSQL versions.

Microsoft SQL Server: Delayed Durability

In SQL Server 2014 it is even easier to configure such a setting:

ALTER DATABASE dbname SET DELAYED_DURABILITY = FORCED;

But there are several exceptions in which the above setting is not applied:

...some transactions are always fully durable, regardless of database settings or commit settings; for example, system transactions, cross-database transactions, and operations involving FileTable, Change Tracking and Change Data Capture. (source)

Especially the "cross-database transactions" exception may be inconvenient.


PS: Liked this article? Please share it on Facebook, Twitter or LinkedIn.