An Alternative to Outbox Pattern for Microservices Architecture | by dongfu ye | Apr, 2022


This article proposes an alternative pattern to Outbox: the 2-phase message. It is based not on a message queue but on github.com/dtm-labs/dtm, a highly available distributed transaction framework.

An inter-bank transfer is a typical distributed transaction scenario, where A needs to transfer money to B across banks. The balances of A and B are not held in the same bank, so they are not stored in a single database. Such a transfer typically crosses microservices as well.

The main problem is that the transfer must update two systems simultaneously: the decrement of A's balance and the increment of B's balance. This is the well-known "dual writes" problem. A process crash between the two updates leaves the entire system in an inconsistent state.
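To make the failure concrete, here is a minimal in-memory sketch of a dual write. The `Ledger` type and the `Transfer` function are hypothetical names invented for this illustration, and the crash is simulated with a flag rather than a real process failure.

```go
package main

import (
	"errors"
	"fmt"
)

// Ledger is a hypothetical stand-in for one bank's database.
type Ledger struct{ Balance int }

var errCrash = errors.New("process crashed between the two writes")

// Transfer performs the two writes one after the other. When crashBetween is
// true it stops after the first write to simulate a process crash.
func Transfer(a, b *Ledger, amount int, crashBetween bool) error {
	a.Balance -= amount // first write: debit A in the first bank
	if crashBetween {
		return errCrash // the credit to B never happens
	}
	b.Balance += amount // second write: credit B in the second bank
	return nil
}

func main() {
	a, b := &Ledger{Balance: 100}, &Ledger{Balance: 0}
	err := Transfer(a, b, 30, true)
	// After the simulated crash the two balances no longer add up to 100.
	fmt.Println(err, a.Balance, b.Balance, a.Balance+b.Balance)
}
```

With the crash flag set, A has been debited but B has not been credited, so 30 has disappeared from the system as a whole.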

This "dual writes" problem can be solved by the Outbox pattern. The principle of the Outbox pattern can be found here: Transactional Outbox

First, let's look at how to accomplish the above transfer task using the new pattern. The following code is in Go; SDKs for other languages such as C# and PHP can be found here: dtm SDKs

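// Create the msg and register the TransIn branch, passing the amount 30.
msg := dtmcli.NewMsg(DtmServer, gid).
	Add(busi.Busi+"/TransIn", &TransReq{Amount: 30})
// Run the local debit and submit the msg as one atomic step.
err := msg.DoAndSubmitDB(busi.Busi+"/QueryPrepared", db, func(tx *sql.Tx) error {
	return AdjustBalance(tx, busi.TransOutUID, -req.Amount)
})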

In the above code:

  • First, create a new DTM msg global transaction, passing the dtm server address and the global transaction ID
  • Add a branch to the msg, which is the transfer operation TransIn, together with the data that needs to be passed to this service, the amount $30
  • Then call msg's DoAndSubmitDB. This function ensures the atomic execution of both the business logic and the submission of the msg: either both succeed or both fail. The function takes three parameters:
  1. The check-back URL, which is explained later
  2. db, the database object for the business
  3. The business function, which in our example debits $30 from A's balance

What happens if the process crashes immediately after A's balance has been decremented? After a timeout, DTM calls the check-back URL to query whether the decrement succeeded or failed. We can implement the check-back service by pasting the following code:

app.GET(BusiAPI+"/QueryPrepared", dtmutil.WrapHandler2(func(c *gin.Context) interface{} {
	return MustBarrierFromGin(c).QueryPrepared(db)
}))

After writing these two pieces of code, the 2-phase message is complete, and it is much easier to use than Outbox.

You can run the above example with the following commands.

Run DTM

git clone https://github.com/dtm-labs/dtm && cd dtm
go run main.go

Run Example

git clone https://github.com/dtm-labs/dtm-examples && cd dtm-examples
go run main.go http_msg_doAndCommit

How does DoAndSubmitDB ensure the atomicity of successful business execution and msg submission? Please see the following timing diagram.

Normally, the five steps in the timing diagram all complete, and so does the global transaction. One thing needs explaining here: the commit of the msg is done in two phases, first Prepare, then Submit.

After DTM receives the Prepare request, it does not call the branch transaction but waits for the subsequent Submit. Only when it receives the Submit request does it start the branch call and finally complete the global transaction.
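The two phases can be pictured with a toy in-memory coordinator. This is only a sketch of the behaviour described above, not dtm's implementation: the `Coordinator` type, its methods, and the status strings are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Status values are hypothetical labels for this sketch.
type Status string

const (
	StatusPrepared  Status = "prepared"
	StatusSubmitted Status = "submitted"
	StatusSucceed   Status = "succeed"
)

// Coordinator is a toy in-memory model of the server side.
// It is not dtm's implementation.
type Coordinator struct {
	status   map[string]Status
	branches map[string][]func() error
}

func NewCoordinator() *Coordinator {
	return &Coordinator{
		status:   map[string]Status{},
		branches: map[string][]func() error{},
	}
}

// Prepare stores the message and its branches but calls none of them.
func (c *Coordinator) Prepare(gid string, branches ...func() error) {
	c.status[gid] = StatusPrepared
	c.branches[gid] = branches
}

// Submit marks the message as submitted and only then calls the branches.
func (c *Coordinator) Submit(gid string) error {
	if c.status[gid] != StatusPrepared {
		return errors.New("submit requires a prepared message")
	}
	c.status[gid] = StatusSubmitted
	for _, call := range c.branches[gid] {
		if err := call(); err != nil {
			return err // a real coordinator would retry the branch later
		}
	}
	c.status[gid] = StatusSucceed
	return nil
}

// StatusOf reports the current state of a global transaction.
func (c *Coordinator) StatusOf(gid string) Status { return c.status[gid] }

func main() {
	calls := 0
	c := NewCoordinator()
	c.Prepare("gid-1", func() error { calls++; return nil })
	fmt.Println(c.StatusOf("gid-1"), calls) // the branch has not been called yet
	if err := c.Submit("gid-1"); err != nil {
		fmt.Println("submit failed:", err)
	}
	fmt.Println(c.StatusOf("gid-1"), calls) // the branch has been called once
}
```

The point of the sketch is the ordering: a prepared message is only a record, and no downstream call is made until Submit arrives.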

In a distributed system, all kinds of downtime and network exceptions must be considered, so let's take a look at what can happen.

The most important goal is that the business execution and the message submission together form one atomic operation. So let's first look at what happens if there is a downtime failure after the business execution and before the message submission, and how the new pattern ensures atomicity.

Let's take a look at the timing diagram in this case.

In this case, after a certain timeout DTM polls for messages that are only Prepared but not Submitted and calls the check-back service specified by the message to query whether the business execution succeeded.

This check-back service looks inside the message table and queries whether the local transaction for the business has been committed.

  • Committed: Success is returned; dtm submits the global transaction and proceeds to the next sub-transaction call
  • Rolled back: Failure is returned; dtm terminates the global transaction and no more sub-transaction calls are made
  • In progress: The check-back waits for the final result and then proceeds as in the committed or rolled-back case above
  • Not started: The check-back inserts data to ensure that the local transaction for the business eventually fails

Let's take a look at the timing diagram of a local transaction being rolled back.

If the process crashes immediately after dtm receives the Prepare call and before the transaction is committed, the local database detects the process's disconnection and rolls back the local transaction automatically.

Subsequently, dtm polls for the global transactions that have timed out, that is, those that are only Prepared but not Submitted, and checks back. The check-back service finds that the local transaction has been rolled back and returns the result to dtm. dtm receives the result indicating a rollback, marks the global transaction as failed, and finally ends the global transaction.

The Outbox pattern can also ensure the eventual consistency of the data. When the Outbox pattern is used, the work required includes:

  • Executing the local business logic in the local transaction, inserting the messages into the message table, and finally committing the transaction.
  • Writing polling tasks that take messages from the local message table and send them to the message queue. Instead of periodically executing SQL to poll, this step may use another technique: Log-based Change Data Capture.
  • Consuming messages.
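For comparison, the first two Outbox steps listed above can be sketched in memory as follows. The `DB` and `OutboxRow` types are hypothetical stand-ins for a real database and message table, and the "transaction" is simply a function that applies both changes or neither.

```go
package main

import "fmt"

// OutboxRow is one pending message in the hypothetical local message table.
type OutboxRow struct {
	ID      int
	Payload string
	Sent    bool
}

// DB models one local database holding business data and the outbox table.
type DB struct {
	Balance int
	Outbox  []OutboxRow
}

// DebitWithOutbox models the local transaction: the balance change and the
// outbox insert are applied together or not at all.
func (db *DB) DebitWithOutbox(amount int, payload string) error {
	if db.Balance < amount {
		return fmt.Errorf("insufficient balance: have %d, need %d", db.Balance, amount)
	}
	db.Balance -= amount
	db.Outbox = append(db.Outbox, OutboxRow{ID: len(db.Outbox) + 1, Payload: payload})
	return nil
}

// Relay is one polling pass: publish every unsent row, then mark it as sent.
// It returns the number of rows published in this pass.
func (db *DB) Relay(publish func(payload string) error) int {
	sent := 0
	for i := range db.Outbox {
		if db.Outbox[i].Sent {
			continue
		}
		if err := publish(db.Outbox[i].Payload); err != nil {
			break // try again on the next pass
		}
		db.Outbox[i].Sent = true
		sent++
	}
	return sent
}

func main() {
	db := &DB{Balance: 100}
	var queue []string // stands in for the message queue
	if err := db.DebitWithOutbox(30, "TransIn amount=30"); err != nil {
		fmt.Println("debit failed:", err)
	}
	n := db.Relay(func(p string) error { queue = append(queue, p); return nil })
	fmt.Println(db.Balance, n, queue)
}
```

Even in this reduced form, the relay loop and the queue consumer are extra moving parts that the application team has to write and operate.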

Compared with Outbox, the 2-phase message has the following advantages.

  • No need to learn or maintain any message queues
  • No polling tasks to handle
  • No need to consume messages

2-phase messages only need DTM, which is much easier to learn and maintain than a message queue. All the skills involved are function calls and service calls, which are familiar to every developer.

  • The exposed interfaces of 2-phase messages are completely independent of the queue and relate only to the actual business and service calls, making them more developer-friendly
  • 2-phase messages do not have to consider message stacking and other queue failures, because they depend only on dtm. Developers can think of dtm as being the same as any other ordinary stateless service in the system, depending only on the storage behind it, MySQL/Redis.
  • A message queue is asynchronous, while 2-phase messages support both asynchronous and synchronous modes. The default behaviour is asynchronous, and you can wait synchronously for the downstream service to complete just by setting msg.WaitResult=true.
  • 2-phase messages also support specifying multiple downstream services at the same time

Application of 2-Phase Message

2-phase messages can significantly reduce the difficulty of an eventual consistency solution and have been widely used. Here are two typical applications.

  • flash-sale system: this architecture can easily carry tens of thousands of order requests on a single machine, and ensure that the amount of inventory deducted and the number of orders match exactly
  • cache consistency: this architecture can easily ensure the consistency of DB and cache through a 2-phase message, which is much better than a queue or a binlog subscription solution

Examples of using the Redis and Mongo storage engines together with 2-phase messages can be found in dtm-examples

The check-back service appears in the previous timing diagrams, as well as in the interface. This check-back design first existed in RocketMQ, where the implementation is left to developers to handle manually. In 2-phase messages, it is handled automatically by copy-and-paste code. So what is the principle of the automatic processing?

To perform a check-back, we first create a separate table in the business database instance where the gid (global transaction ID) is stored. The gid is written to this table when the business transaction is processed.

When we check back with the gid, if we find the gid in the table, the local transaction has been committed, so we can return to dtm the result that the local transaction has been committed.

When we check back with the gid, if we do not find the gid in the table, the local transaction has not been committed. There are three possible outcomes:

  1. The transaction is still in progress.
  2. The transaction has been rolled again.
  3. The transaction has not began.

I have searched through a lot of information about RocketMQ's check-back, but have not found an error-free solution. Most solutions work like this: if the gid is not found, do nothing and wait for the next check-back 10 seconds later. If the check-back has lasted 2 minutes or longer and still cannot find the gid, the local transaction is considered rolled back.

There are problems in the following cases.

  • In the extreme case, a database failure (such as a process pause or disk jam) may last longer than 2 minutes, after which the data is finally committed. But RocketMQ assumes the transaction was rolled back and cancels the global transaction, leaving the data in an inconsistent state.
  • If a local transaction has already been rolled back, the check-back service still keeps polling every 10 seconds for two minutes, causing unnecessary load on the server.

These problems are completely solved by dtm's 2-phase message solution. It works as follows.

  1. When a local transaction is processed, the gid is inserted into the table dtm_barrier.barrier with an insert reason of COMMITTED. Table dtm_barrier.barrier has a unique index on gid.
  2. When checking back, the 2-phase message does not directly query whether the gid exists, but instead uses insert ignore to insert a row with the same gid, together with the reason ROLLBACKED. At this point, if there is already a record with the gid in the table, the new insert is ignored; otherwise the row is inserted.
  3. Query the record in the table with the gid. If the reason of the record is COMMITTED, the local transaction has been committed; if the reason of the record is ROLLBACKED, the local transaction has been rolled back or will be rolled back.

So how do 2-phase messages distinguish between in-progress and rolled-back transactions? The trick lies in the data inserted during the check-back. If the database transaction is still in progress at the time of the check-back, the insert operation is blocked by the in-progress transaction, because the insert in the check-back waits for the row lock held by that transaction. If the insert operation returns normally, the local transaction in the database must have ended.

2-phase messages can replace not only Outbox but also the normal message pattern. If you call Submit directly, it is similar to the normal message pattern but provides a simpler and more flexible interface.

Suppose an application scenario where there is a button on the UI to participate in an activity that grants permanent access to two eBooks. In this case, the server side can be handled like this:

// Register one AuthBook branch per eBook, then submit directly.
msg := dtmcli.NewMsg(DtmServer, gid).
	Add(busi.Busi+"/AuthBook", &Req{UID: 1, BookID: 5}).
	Add(busi.Busi+"/AuthBook", &Req{UID: 1, BookID: 6})
err := msg.Submit()

This approach also provides an asynchronous interface without relying on a message queue.

The 2-phase message proposed in this article has a simple interface and offers a more elegant pattern than Outbox.

You are welcome to visit github.com/dtm-labs/dtm. It is a project dedicated to making distributed transactions in microservices easier. It supports multiple languages and multiple patterns, such as 2-phase message, Saga, TCC, and XA.
