Remove AOF optimization to skip expired keys.

Basically we cannot be sure that if the key is expired while writing the
AOF, the main thread will surely find the key expired. There are
possible race conditions like the moment at which the "now" is sampled,
and the fact that time may jump backward.

Think about the following:

SET a 5
EXPIRE a 1

AOF rewrite starts after about 1 second. The child process finds the key
expired, while in the main thread instead an INCR command is called
against the key "a" immediately after a fork, and the scheduler was
faster to give execution time to the main thread, so "a" is yet not
expired.

The main thread will generate an INCR a command to the AOF log that will
be appended to the rewritten AOF file, but that INCR command will target
a non existin "a" key, so a new non volatile key "a" will be created.

Two observations:

A) In theory by computing "now" before the fork, we should be sure that
if a key is expired at that time, it will be expired later when the
main thread will try to access to such key. However this does not take
into account the fact that the computer time may jump backward.

B) Technically we may still make the process safe by using a monotonic
time source.

However there were other similar related bugs, and in general the new
"vision" is that Redis persistence files should represent the memory
state without trying to be too smart: this makes the design more
consistent, bugs less likely to arise from complex interactions, and in
the end what is to fix is the Redis expire process to have less expired
keys in RAM.

Thanks to Oran Agra and Guy Benoish for writing me an email outlining
this problem, after they conducted a Redis 5 code review.
This commit is contained in:
antirez 2018-06-19 15:43:06 +02:00
parent 44571088d8
commit ba92b517b8

View File

@ -1247,9 +1247,6 @@ int rewriteAppendOnlyFileRio(rio *aof) {
expiretime = getExpire(db,&key); expiretime = getExpire(db,&key);
/* If this key is already expired skip it */
if (expiretime != -1 && expiretime < now) continue;
/* Save the key and associated value */ /* Save the key and associated value */
if (o->type == OBJ_STRING) { if (o->type == OBJ_STRING) {
/* Emit a SET command */ /* Emit a SET command */