Database Transactions: Everything That Can Go Wrong When Using Them

Note: This is an excerpt from an unedited version of my book MariaDB for Developers.
Up to this point, we have covered the concept of atomicity: either all operations succeed or none do. What can go wrong? It seems like we are covered. And we are. Until we introduce concurrency into our system. MariaDB is a high-performance database system that processes work in parallel to increase throughput.
Parallelizing means that MariaDB can execute transactions from different sessions at the same time by interleaving operations from different transactions instead of waiting for one to finish before starting the next. Each transaction has its own sequence of operations, but MariaDB executes them in overlapping order. Figure 8-2 shows two transactions (A and B) and multiple database operations interleaved through time.
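As a rough illustration of interleaving, the sketch below builds one possible schedule from two transactions. The operation names and the round-robin order are assumptions made for the example; MariaDB decides the real order at run time, and it will generally differ.

```python
from itertools import zip_longest

# Hypothetical operation sequences for two transactions, A and B.
tx_a = ["A: BEGIN", "A: read", "A: write", "A: COMMIT"]
tx_b = ["B: BEGIN", "B: write", "B: COMMIT"]


def interleave(*transactions):
    """Build a round-robin schedule that keeps each transaction's own order."""
    slots = zip_longest(*transactions)  # pads the shorter sequence with None
    return [op for slot in slots for op in slot if op is not None]


schedule = interleave(tx_a, tx_b)
for t, op in enumerate(schedule, start=1):
    print(f"t{t}: {op}")
```

Each transaction still sees its own operations in order; only the global order across transactions is mixed.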

This interleaving allows MariaDB to use CPU and I/O resources more efficiently than it could by running transactions one after another. This, however, opens the door to subtle problems when the parallel transactions read and write overlapping data. Let’s study some of these problems, known as concurrency phenomena.
Dirty Reads
Friday afternoon, and we’ve got a winner! Our to-do application (which by chapter 6 had become more of a project management tool than a to-do app) is so central to the business that prizes are given to users who excel at reporting bugs or helping with its development. The application allows the HR team to grant prizes to users, and this use case involves reducing the quantity of the awarded prize in the prizes table.
Janet and Moe, both from HR, are using our to-do app at the same time. Janet is about to grant today’s prize (named “Bagelers” in our database), while Moe is viewing a dashboard that shows an overview of the prize inventory. Janet selects the winner and the prize, and clicks “Grant prize.” Our to-do app starts a new transaction that decreases the quantity of Bagelers from 8 to 7.
At that moment, Moe refreshes the dashboard and sees that there are 7 Bagelers. However, the system then crashes, and since the transaction was never committed, the new quantity is never persisted. Janet gets an error, but Moe doesn’t. To him, there are 7 Bagelers, even though the database still holds 8. He is seeing incorrect data. This is called a dirty read: a transaction reads data that another transaction has written but not yet committed. Figure 8-3 shows an example of a sequence of operations that leads to a dirty read at time t3.
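The scenario can be sketched with a toy in-memory model. This is not how MariaDB stores data or runs transactions; the Transaction class and the two read functions are hypothetical helpers invented for this example.

```python
# Toy in-memory model of a dirty read. Not MariaDB's implementation.
committed = {"Bagelers": 8}  # committed state of the prizes table


class Transaction:
    """Hypothetical transaction that buffers its writes until commit."""

    def __init__(self, store):
        self.store = store
        self.pending = {}  # uncommitted changes made by this transaction

    def write(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.store.update(self.pending)
        self.pending = {}

    def rollback(self):
        self.pending = {}


def read_uncommitted(store, other_tx, key):
    """Read that (unsafely) sees another transaction's uncommitted writes."""
    return other_tx.pending.get(key, store[key])


def read_committed(store, key):
    """Read that only sees committed data."""
    return store[key]


janet = Transaction(committed)
janet.write("Bagelers", 7)  # Janet's update, not yet committed

moe_sees = read_uncommitted(committed, janet, "Bagelers")  # the dirty read
safe_read = read_committed(committed, "Bagelers")

janet.rollback()  # the crash: Janet's change is discarded
actual = committed["Bagelers"]

print(moe_sees, safe_read, actual)  # prints: 7 8 8
```

Moe's dashboard shows 7, a value that was never committed, while a read restricted to committed data would have shown 8.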

Non-Repeatable Reads
A similar situation can occur when a transaction reads a value twice, but another transaction modifies that value and commits between the two reads. In this case, the second read obtains a different value. This phenomenon is called a non-repeatable read and can lead to incorrect results if the values are used for other calculations in the same transaction. Figure 8-4 shows a non-repeatable read at time t5.
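A minimal sketch of the same idea, again using a toy in-memory table rather than MariaDB itself. LatestReader and SnapshotReader are hypothetical names: the first always reads the latest committed value, while the second reads from a copy taken when its transaction starts, which is shown only for contrast.

```python
# Toy model of a non-repeatable read. Not MariaDB's implementation.
prizes = {"Bagelers": 8}  # committed quantities in the prizes table


class LatestReader:
    """Hypothetical transaction whose every read sees the latest committed value."""

    def __init__(self, table):
        self.table = table

    def read(self, key):
        return self.table[key]


class SnapshotReader:
    """Hypothetical transaction that reads from a copy taken when it starts."""

    def __init__(self, table):
        self.snapshot = dict(table)

    def read(self, key):
        return self.snapshot[key]


tx_latest = LatestReader(prizes)
tx_snapshot = SnapshotReader(prizes)

first = tx_latest.read("Bagelers")
snap_first = tx_snapshot.read("Bagelers")

prizes["Bagelers"] = 7  # another transaction updates the row and commits

second = tx_latest.read("Bagelers")      # differs from the first read
snap_second = tx_snapshot.read("Bagelers")  # same as the first read

print(first, second)            # prints: 8 7
print(snap_first, snap_second)  # prints: 8 8
```

If the first transaction computed something from both reads, such as a difference or a total, the result would mix two different states of the same row.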

Phantom Reads
If the write operation in the previous example inserts new rows instead of modifying an existing one, we get what’s called a phantom read: the second read returns rows that did not exist during the first. Figure 8-5 shows a phantom read at time t5.
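The following toy sketch illustrates a phantom read. The rows, the "Mugs" and "Stickers" prizes, and the in_stock query are made up for the example; only "Bagelers" comes from the scenario above, and none of this reflects MariaDB's internals.

```python
# Toy model of a phantom read. The rows below are made-up sample data.
prizes = [
    {"name": "Bagelers", "quantity": 7},
    {"name": "Mugs", "quantity": 0},
]


def in_stock(table):
    """Range-style query: names of all prizes with a quantity above zero."""
    return [row["name"] for row in table if row["quantity"] > 0]


# Transaction A runs the query for the first time.
first = in_stock(prizes)

# Transaction B inserts a new row that matches the query, then commits.
prizes.append({"name": "Stickers", "quantity": 20})

# Transaction A runs the same query again and gets an extra row.
second = in_stock(prizes)
phantoms = [name for name in second if name not in first]

print(first)     # prints: ['Bagelers']
print(second)    # prints: ['Bagelers', 'Stickers']
print(phantoms)  # prints: ['Stickers']
```

No existing row changed between the two queries; the difference comes entirely from a row that appeared in the queried range.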

 
				


