Dealing With Inconsistent Reads When Using Transactions And Async Tasks In Enterprise Applications | by Lucas Pereyra | May, 2022

A common problem and a quick way to avoid it

Photo by Shubham Dhage on Unsplash

Enterprise applications often rely on transactional database features to ensure that a group of database operations is applied to the database as a whole, from the first operation to the last, so that the operations are never partially applied.

That said, every read operation against data that is being modified in a transaction will have the transaction's changes reflected in it, as long as the read is performed after the transaction has been successfully committed.

Many enterprise applications have to deal with a large amount of traffic, which leads to complex read/write scenarios where concurrency issues are common.

If a read operation is performed on a data item while a transaction is concurrently modifying it, the read will most likely retrieve an outdated version of the data item that does not yet have the transaction's changes applied, since most databases hide uncommitted changes from other connections by default. Every read operation performed before the transaction has completely finished is likely to behave this way.

A particular case of these concurrent read/write scenarios arises when asynchronous task processing is used together with transactions. Often, when a particular business operation is performed, one or more asynchronous tasks need to be carried out as well.

Most of the time, these tasks are related to specific application implementation details, so they can be performed at any moment in the future. For that to happen, tasks are usually enqueued using a queuing mechanism from which one or more task runners can take them and execute them asynchronously. A common setup of this environment would look like this:

The main application runs in a group of specific nodes that have similar infrastructure settings. The main application uses an external mechanism to enqueue tasks which are pending to be executed later. At any moment, a task runner could pick one task from the queue and execute it. A task runner is no more than a process that will be executed using a dedicated node or set of nodes with their own infrastructure settings. The tasks queue could be implemented using a Redis stack, for example.
An example of an infrastructure setup that holds three main processes: the main application, the task runners and a queueing mechanism
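
The enqueue-and-run cycle described above can be sketched in a few lines. This is only an illustration: an in-process `SplQueue` stands in for the external queue, and the task names, the payload shape and the `runPendingTasks` function are assumptions made for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical in-process stand-in for the external queue.
$queue = new SplQueue();

// The main application enqueues tasks that are pending to be executed later.
$queue->enqueue(['task' => 'UpdateSalariesPerDepartment', 'payload' => ['department' => 'engineering']]);
$queue->enqueue(['task' => 'SendNotification', 'payload' => ['employee' => 'alice']]);

// A task runner takes pending tasks in FIFO order and executes them.
function runPendingTasks(SplQueue $queue): array
{
    $executed = [];
    while (!$queue->isEmpty()) {
        $job = $queue->dequeue();
        // A real runner would dispatch to the task's handler here.
        $executed[] = $job['task'];
    }
    return $executed;
}

$executed = runPendingTasks($queue);
echo implode(', ', $executed), "\n"; // prints "UpdateSalariesPerDepartment, SendNotification"
```

In a real deployment the runner would live in its own process on its own nodes, which is what lets it start a task while the application's transaction is still open.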

The problem arises when tasks are triggered before transactions have successfully finished, and those tasks perform read/write operations on the same pieces of data those transactions might be using.

This situation leads to tasks being performed with outdated versions of data, thus producing inconsistencies across the application. The following diagram attempts to better depict this situation:

If a data item “X” is modified inside an open transaction, the changes won’t be reflected in the database until the transaction is committed. Supposing that, after data item “X” is modified, task “A” is enqueued and executed immediately, no read operation that task “A” performs on data item “X” will retrieve the latest changes. Furthermore, every write operation that task “A” performs using data item “X” will be wrong.
Execution flow that shows how read operations performed in task “A” could be inconsistent

Note that for this scenario to happen, tasks that were enqueued during a transaction have to be taken immediately by a task runner process. If it takes longer for the task runner to start a task, then by the time the task starts the transaction may already be finished, and the problem would not occur. An empty queue at the moment the task is enqueued is the ideal scenario for this issue, since the task would be taken immediately by an idle task runner.

To better describe the issue, a quick, simple example will be presented. Use it as an aid to clarify what the problem might look like in a practical situation rather than as a precise guide to how it actually looks. In practice, this issue could give rise to more complex and hard-to-diagnose scenarios.

Often, the results of large computations are stored in their own database tables to avoid repeating the same computations every time those results are requested by some application feature.

Suppose that an employee’s salary is rarely modified, that there are thousands of employees in a company, and that the average salary of employees per department is frequently requested. An asynchronous task that updates the average_per_department table every time an employee’s salary is updated could have been implemented. A quick PHP skeleton for that implementation would look like the following:
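
A minimal, runnable sketch of such a skeleton follows. It is an illustration under stated assumptions, not a definitive implementation: `FakeDatabase` and `Connection` are hypothetical in-memory stand-ins for a real database (writes inside a transaction stay buffered until `commit()`), the employee names and figures are made up, and calling `run()` directly models an idle task runner that takes the task the moment it is enqueued.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the database.
final class FakeDatabase
{
    /** @var array<string, float> committed rows, visible to every connection */
    public array $rows = [];
}

final class Connection
{
    private FakeDatabase $db;
    /** @var array<string, float>|null writes buffered by the open transaction */
    private ?array $pending = null;

    public function __construct(FakeDatabase $db)
    {
        $this->db = $db;
    }

    public function beginTransaction(): void
    {
        $this->pending = [];
    }

    public function set(string $key, float $value): void
    {
        if ($this->pending !== null) {
            $this->pending[$key] = $value; // invisible to other connections
        } else {
            $this->db->rows[$key] = $value; // no open transaction: applied at once
        }
    }

    public function commit(): void
    {
        $this->db->rows = array_replace($this->db->rows, $this->pending ?? []);
        $this->pending = null;
    }

    /** @return array<string, float> committed rows whose key starts with $prefix */
    public function select(string $prefix): array
    {
        return array_filter(
            $this->db->rows,
            fn ($key) => strpos((string) $key, $prefix) === 0,
            ARRAY_FILTER_USE_KEY
        );
    }
}

final class UpdateSalariesPerDepartment
{
    // Recomputes the stored average from the salaries this connection can see.
    public function run(Connection $conn): void
    {
        $salaries = $conn->select('salary:');
        $conn->set('average_per_department:engineering', array_sum($salaries) / count($salaries));
    }
}

$db = new FakeDatabase();
$db->rows = [
    'salary:alice' => 3000.0,
    'salary:bob' => 5000.0,
    'average_per_department:engineering' => 4000.0,
];
$app = new Connection($db);    // main application
$runner = new Connection($db); // task runner, on its own connection

$app->beginTransaction();
$app->set('salary:alice', 4000.0);
// The task is enqueued inside the transaction and an idle runner starts it at once.
(new UpdateSalariesPerDepartment())->run($runner);
$app->commit();

$storedAverage = $db->rows['average_per_department:engineering'];
$actualAverage = ($db->rows['salary:alice'] + $db->rows['salary:bob']) / 2;
echo "stored: $storedAverage, actual: $actualAverage\n"; // prints "stored: 4000, actual: 4500"
```

The stored average stays at 4000 even though the committed salaries now average 4500, because the task ran while the salary change was still buffered in the open transaction.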

For this, the execution flow would look like the following:

Execution flow of the proposed example. An “UpdateSalariesPerDepartment” task is enqueued right after an employee’s salary has been updated, within a transaction. If the “UpdateSalariesPerDepartment” task is executed before the transaction ends, it will calculate the averages using the employee’s old salary instead of the updated one.
Execution flow that shows how the “UpdateSalariesPerDepartment” task could read the employees’ salaries inconsistently

Once the UpdateSalariesPerDepartment task has been enqueued, and supposing that an idle task runner is available, it will be executed immediately, which means its run implementation starts while the transaction is still open.

As a result, this task will read outdated data and will calculate the average using the employee’s old salary. Hence, the task won’t really make any change to the stored averages.

Although this issue could be very difficult to detect and diagnose in a real production environment, the solution seems to be quite simple: asynchronous tasks that need to operate on data items that are being modified by a transaction should be triggered only once all the transactional changes have been committed to the database.

Since these are asynchronous tasks, it does not matter that their execution is delayed until the transaction has finished.

Moreover, the main benefit of this approach is the guarantee that tasks are always executed with the latest versions of the affected data items.

Looking at the previous example, the applied workaround would look like this:
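
A minimal sketch of that ordering is shown below. It uses a hypothetical `DeferredDispatcher` helper, introduced here only for illustration: tasks registered while the transaction is open are held back and triggered after a successful commit, and are dropped on rollback. The log entries stand in for the real database calls, the enqueue call and the employee notification.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: holds tasks back until the transaction has committed.
final class DeferredDispatcher
{
    /** @var callable[] tasks waiting for the commit */
    private array $deferred = [];
    /** @var string[] ordered record of what happened, for illustration only */
    public array $log = [];

    public function afterCommit(callable $task): void
    {
        $this->deferred[] = $task;
    }

    // Runs $work inside a (simulated) transaction; returns true if it committed.
    public function transaction(callable $work): bool
    {
        $this->deferred = [];
        $this->log[] = 'begin';
        try {
            $work($this);
            $this->log[] = 'commit';
        } catch (Throwable $e) {
            $this->log[] = 'rollback';
            $this->deferred = []; // rolled back: tasks are never triggered
            return false;
        }
        foreach ($this->deferred as $task) {
            $task(); // stands in for enqueueing on the real queue
        }
        $this->deferred = [];
        return true;
    }
}

$tx = new DeferredDispatcher();

// Successful update: the task and the notification run after the commit.
$committed = $tx->transaction(function (DeferredDispatcher $tx): void {
    $tx->log[] = 'update salary';
    $tx->afterCommit(function () use ($tx): void {
        $tx->log[] = 'enqueue UpdateSalariesPerDepartment';
    });
    $tx->afterCommit(function () use ($tx): void {
        $tx->log[] = 'notify employee';
    });
});

// Failed update: the registered task is discarded with the rollback.
$failed = $tx->transaction(function (DeferredDispatcher $tx): void {
    $tx->afterCommit(function () use ($tx): void {
        $tx->log[] = 'enqueue UpdateSalariesPerDepartment';
    });
    throw new RuntimeException('salary update failed');
});

echo implode(' -> ', $tx->log), "\n";
// prints "begin -> update salary -> commit -> enqueue UpdateSalariesPerDepartment -> notify employee -> begin -> rollback"
```

Simply moving the enqueue call below the commit achieves the same effect; collecting the tasks in one place is a design choice that keeps the rule from being forgotten when the transaction body grows.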

Note that if the transactional operations are rolled back, the asynchronous tasks are never triggered. On the other hand, once the employee’s salary modification has been successfully executed and the transaction has been committed, the task that updates the average salaries is enqueued and the employee notification is sent. The execution flow has changed, and it now looks like the following:

Execution flow of the proposed example with the workaround applied. The “UpdateSalariesPerDepartment” task is enqueued after the transaction that updates the employee’s salary has finished. By doing this, there won’t be any chance that the task reads an old version of an employee’s salary. The employee notification is also sent once the salary has been updated.
Execution flow showing how the “UpdateSalariesPerDepartment” task will now always read from and write to the database consistently

Now the UpdateSalariesPerDepartment task is always executed once the employees’ salaries have been updated, no matter how many idle task runners there are at any time.

The averages per department will always be fully consistent with the employees’ salaries.
