Understanding Redis Parent-child Node Inconsistency | by Dwen | Apr, 2022

The problem of reading expired data from Redis child nodes


I recently encountered a problem in a production environment where a key on the Redis parent node had expired, but the child nodes were still returning the expired data.

Today, I'd like to share what I learned from this problem.

We know that most business scenarios are read-heavy and write-light. To take advantage of this characteristic and improve the throughput of a Redis cluster, a parent-child architecture with read-write separation is usually used.

As shown in the figure above.

  • Parent node: responsible for the write operations of the business.
  • Child node: synchronizes the data of the parent node in real time and serves reads.

To improve throughput, an architecture with one parent and multiple children is adopted, distributing the read pressure of the business across multiple servers.

The above scheme seems reasonable, but in fact there may be some hidden dangers!

The high performance of Redis is mainly due to pure in-memory operations, but memory is an expensive storage medium, so the amount of data that can be stored is constrained.

The common solution is to set an expiration time. Data that is not used very frequently is deleted regularly to improve resource utilization.

To delete expired data, Redis provides two strategies:

  1. Lazy delete.

Also known as passive deletion. When the data expires, it is not deleted immediately. Instead, Redis waits until a request accesses the key, checks it, and deletes it if it has expired.

Advantages: No need to start additional scanning threads, which reduces CPU consumption.

Disadvantages: A large amount of expired data can remain in memory. It is only checked and deleted when an access triggers the check; otherwise it keeps occupying memory.

  2. Periodic delete.

At regular intervals (every 100 ms by default), Redis randomly selects a certain number of keys, checks whether they have expired, and deletes the expired ones.
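The two strategies can be sketched with a toy in-memory store. This is only an illustration of the idea, not how Redis is implemented internally; the class name ExpiringStore, its methods, and the sample size are all assumptions made for this example.

```python
import random


class ExpiringStore:
    """Toy key-value store with expiry (hypothetical, for illustration only)."""

    def __init__(self):
        self.data = {}      # key -> value
        self.expires = {}   # key -> absolute expiry time in seconds

    def set(self, key, value, ttl, now):
        self.data[key] = value
        self.expires[key] = now + ttl

    def _is_expired(self, key, now):
        return key in self.expires and self.expires[key] <= now

    def _delete(self, key):
        self.data.pop(key, None)
        self.expires.pop(key, None)

    def get(self, key, now):
        # Lazy deletion: the expiry check runs only when the key is accessed.
        if self._is_expired(key, now):
            self._delete(key)
            return None
        return self.data.get(key)

    def periodic_sweep(self, now, sample_size=20):
        # Periodic deletion: sample some keys that have a TTL
        # and delete the ones that have expired.
        keys = list(self.expires)
        removed = 0
        for key in random.sample(keys, min(sample_size, len(keys))):
            if self._is_expired(key, now):
                self._delete(key)
                removed += 1
        return removed


store = ExpiringStore()
store.set("a", 1, ttl=5, now=0)
print("a" in store.data)        # True: expired data stays until something touches it
print(store.get("a", now=10))   # None: the access triggers lazy deletion
print("a" in store.data)        # False
```

The sweep only looks at a sample, so with many keys some expired entries can survive several rounds; that is the trade-off between CPU cost and memory usage described above.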

You may be asking: since Redis has a policy for deleting expired data, why does it still return expired data?

This starts with parent-child synchronization. Let's look at the flow chart first.

When the client writes data to the parent node and sets an expiration time, the data is synchronized to the child node asynchronously.

1. If the parent node is read at this point and the data has expired, lazy deletion on the parent takes effect: the delete operation is triggered, and the client does not get the expired data.

2. However, if the child node is read, it is possible to get expired data.

There are two reasons.

Reason one:

It is related to the version of Redis. Before Redis 3.2, a read on the child node does not check whether the data has expired, so it may return expired data.

Solution:

Upgrade Redis to at least version 3.2. When reading from a child node, if the data has expired, the child filters it out and returns a null value.

Note:

Although the synchronized data has expired at this point, based on the principle that whoever produces the data maintains it, the child node does not actively delete it. It relies on the key deletion command synchronized from the parent node.
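A small toy model of a read on a child node makes the difference concrete. The function read_from_child and its checks_expiry flag are hypothetical names for this sketch, which only mimics the behaviour described above and is not Redis code.

```python
def read_from_child(child_data, child_expires, key, now, checks_expiry):
    """Toy model of a read on a child node.

    checks_expiry=False mimics the behaviour described for Redis before 3.2;
    checks_expiry=True mimics 3.2 and later.
    """
    expired = key in child_expires and child_expires[key] <= now
    if checks_expiry and expired:
        # The value is hidden from the client, but the key is NOT deleted here:
        # the child waits for the delete command synchronized from the parent.
        return None
    return child_data.get(key)


child_data = {"session": "abc"}
child_expires = {"session": 100}   # expiry time in seconds

print(read_from_child(child_data, child_expires, "session", 150, checks_expiry=False))  # abc
print(read_from_child(child_data, child_expires, "session", 150, checks_expiry=True))   # None
print("session" in child_data)  # True: still stored until the parent's delete arrives
```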

Reason two:

It is related to how the expiration time is set. We generally use EXPIRE and PEXPIRE, which count the TTL from the moment the command is executed. The actual expiration time therefore depends heavily on when that start time is counted.

  • EXPIRE: the unit is seconds
  • PEXPIRE: the unit is milliseconds

As shown in the figure above, the process is briefly as follows:

The parent node writes a piece of data with an expiration time at t1, and the data is valid until t3.

Due to network issues or the execution efficiency of the cache server, the command is not executed on the child node immediately. It is not executed until t2, so the validity period of the data on the child is pushed back to t5.

If the client accesses the child node between t3 and t5, it finds that the data is still within its validity period and uses it as normal, even though the data has already expired on the parent node.
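The timeline can be worked through with concrete numbers. The values below (a 60-second TTL and a 2.5-second replication delay) are made up for illustration, and the helper names are hypothetical; only the labels t1, t2, t3 and t5 come from the description above.

```python
def expiry_time(executed_at, ttl):
    """EXPIRE/PEXPIRE semantics: the TTL counts from the moment of execution."""
    return executed_at + ttl


def is_valid(now, expires_at):
    return now < expires_at


ttl = 60.0
t1 = 1000.0                 # parent executes the command
t2 = 1002.5                 # child executes the replicated command 2.5 s later
t3 = expiry_time(t1, ttl)   # expiry on the parent
t5 = expiry_time(t2, ttl)   # expiry on the child
stale_window = t5 - t3      # how long the child keeps serving expired data

print(t3, t5, stale_window)      # 1060.0 1062.5 2.5
print(is_valid(1061.0, t3))      # False: already expired on the parent
print(is_valid(1061.0, t5))      # True: still looks valid on the child
```

The stale window equals the replication delay, so any lag between parent and child translates directly into time during which the child can serve expired data.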

Solution:

You can use two other Redis commands, EXPIREAT and PEXPIREAT, which are relatively simple: they set the expiration to a specific point in time. This avoids the dependency on when the start time is counted.

  • EXPIREAT: the unit is seconds
  • PEXPIREAT: the unit is milliseconds

Because EXPIREAT and PEXPIREAT set a point in time, the clocks of the parent and child nodes must be consistent, so they need to be synchronized with an NTP time server.
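As a rough sketch of why an absolute deadline removes the drift but introduces a clock dependency, the example below computes the timestamp once and evaluates it against different local clocks. The helpers absolute_expiry and is_expired are hypothetical, and the numbers are illustrative; in a real client the computed timestamp would be passed to EXPIREAT (or a millisecond value to PEXPIREAT).

```python
import time


def absolute_expiry(ttl_seconds, now=None):
    """Return the Unix timestamp (whole seconds) at which a key should expire."""
    if now is None:
        now = time.time()
    return int(now) + ttl_seconds


def is_expired(expire_at, local_clock):
    # Each node compares the same deadline against its OWN clock.
    return local_clock >= expire_at


expire_at = absolute_expiry(60, now=1000)
print(expire_at)  # 1060, no matter when the child applies the command

# Clocks in sync: both nodes agree the key has expired at 1061.
print(is_expired(expire_at, local_clock=1061))      # True

# Child clock running 3 s behind: the child still treats the key as valid.
print(is_expired(expire_at, local_clock=1061 - 3))  # False
```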

In parent-child synchronization, besides the fact that reading from a child node may return expired data, you may also run into data consistency problems.

What is parent-child data inconsistency? It means that the value the client reads from the child node is inconsistent with the value read from the parent node!

As the picture shows:

  • The client writes to the parent node; the value is 100.
  • The parent then syncs the value 100 to the child.
  • Next, the client accesses the parent node and updates the value to 200.
  • Because parent-child synchronization is asynchronous, there is a certain delay. If the latest data has not yet been synchronized to the child node, a read on the child node does not return the latest value.
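The sequence in the list above can be simulated with a toy parent and child, where the child applies writes only when it gets around to draining a backlog. The Parent and Child classes and the key name "stock" are assumptions for this sketch and do not reflect the real replication protocol.

```python
from collections import deque


class Parent:
    def __init__(self):
        self.data = {}
        self.backlog = deque()   # writes not yet applied by the child

    def set(self, key, value):
        self.data[key] = value
        self.backlog.append((key, value))


class Child:
    def __init__(self):
        self.data = {}

    def apply_pending(self, parent):
        # Asynchronous replication: this runs some time after the write.
        while parent.backlog:
            key, value = parent.backlog.popleft()
            self.data[key] = value


parent, child = Parent(), Child()

parent.set("stock", 100)
child.apply_pending(parent)      # the value 100 reaches the child

parent.set("stock", 200)         # update not yet replicated
print(parent.data["stock"], child.data["stock"])  # 200 100

child.apply_pending(parent)      # replication catches up
print(child.data["stock"])       # 200
```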

There are two main reasons why child node synchronization falls behind:

1. The network transmission between the parent and child servers may be delayed.

2. The child node has received the command from the parent node, but because commands are executed in a single thread, it may be busy with time-consuming commands (such as pipeline batch processing) and cannot execute the synchronized command in time.

Solution:

1. The parent and child servers should be deployed in the same data center as far as possible, and the network between them should be kept in good condition.

2. Monitor the synchronization progress between the parent and child nodes. Use the INFO replication command to view the progress of the parent node receiving write commands (parent_repl_offset; Redis reports this field as master_repl_offset) and the progress of the child node replicating write commands (child_repl_offset; reported as slave_repl_offset).

$ parent_repl_offset - child_repl_offset  # Get the replication progress difference between the child node and the parent node

Or we can develop a monitoring program that regularly pulls the progress information of the parent and child servers and calculates the difference.

If the difference exceeds the threshold we set, the client is notified to disconnect from the child node and read from the parent node instead, which reduces data inconsistency to a certain extent.

After the synchronization progress catches up, we resume read operations between the client and the child node.
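A minimal sketch of such a monitoring step is shown below. It parses the text returned by INFO replication on each node, computes the offset difference, and picks a read target. The sample texts, the offsets, the threshold of 500, and the function names are all made up for illustration; the field names master_repl_offset and slave_repl_offset are the ones Redis itself reports, corresponding to parent_repl_offset and child_repl_offset in this article. Fetching the text from live servers and notifying clients is left out.

```python
def parse_info(text):
    """Parse 'key:value' lines of INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def replication_lag(parent_info, child_info):
    """Offset difference between parent and child, as described in the text."""
    parent_offset = int(parse_info(parent_info)["master_repl_offset"])
    child_offset = int(parse_info(child_info)["slave_repl_offset"])
    return parent_offset - child_offset


def choose_read_target(lag, threshold):
    # Above the threshold, reads go to the parent until the child catches up.
    return "parent" if lag > threshold else "child"


# Hypothetical INFO replication excerpts.
PARENT_INFO = "# Replication\nrole:master\nconnected_slaves:1\nmaster_repl_offset:5000\n"
CHILD_INFO = "# Replication\nrole:slave\nslave_repl_offset:4200\n"

lag = replication_lag(PARENT_INFO, CHILD_INFO)
print(lag)                                      # 800
print(choose_read_target(lag, threshold=500))   # parent
print(choose_read_target(10, threshold=500))    # child
```

In practice this check would run on a timer, and the threshold would be tuned to how much staleness the business can tolerate.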
