




I. AOF (Append Only File) Persistence


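AOF saves the database state by recording the write commands sent to the server, similar to the redo log of a relational database. Redis client SDKs such as Jedis talk to Redis Server using RESP (REdis Serialization Protocol), and AOF essentially appends that RESP content to a file as is. For example, a SET KEY VALUE command is appended to the AOF file in its RESP-encoded form.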
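As a minimal sketch of what that encoding looks like, the Python snippet below turns a command into a RESP array of bulk strings, which is the form appended to the AOF file. The function name encode_resp_command is made up for this illustration and is not part of Redis or Jedis.

```python
def encode_resp_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]          # array header: number of arguments
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n" % len(data))  # bulk string header: byte length
        parts.append(data + b"\r\n")          # payload, terminated by CRLF
    return b"".join(parts)


if __name__ == "__main__":
    print(encode_resp_command("SET", "KEY", "VALUE"))
```

For SET KEY VALUE the result consists of the pieces *3, $3, SET, $3, KEY, $5, VALUE, each terminated by \r\n.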




          struct redisServer {

              // ...

              // AOF buffer: commands waiting to be written to the AOF file
              sds aof_buf;

              // ...
          };


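The Redis Server process is essentially a single-threaded event loop built on I/O multiplexing (the epoll mechanism on Linux). In this loop, file events receive command requests from clients, execute the data operations, and send command replies, while time events run the functions that must execute periodically, such as serverCron. After the server executes a write command, it appends the command to the aof_buf buffer. Before each iteration of the event loop ends, the server calls flushAppendOnlyFile, which writes the contents of aof_buf to the AOF file and syncs it to disk according to the configured policy. The pseudocode is as follows: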

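     def eventLoop():
         while True:
             # File events: receive client command requests, execute the data
             # operations, and send command replies. Write commands are
             # appended to aof_buf along the way.
             processFileEvents()    # placeholder name

             # Time events: run serverCron and other periodic functions.
             processTimeEvents()    # placeholder name

             # Write aof_buf to the AOF file and sync it, per the configured policy.
             flushAppendOnlyFile()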






# appendfsync controls when the AOF file is fsynced to disk:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec"
appendfsync everysec
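The three policies can be sketched in Python as follows. This is a toy model under stated assumptions, not Redis's implementation: the class AofWriter and its methods are hypothetical names, and the everysec fsync runs inline here, whereas Redis performs it in a background thread.

```python
import os
import time


class AofWriter:
    """Toy model of aof_buf plus the three appendfsync policies."""

    def __init__(self, path: str, policy: str = "everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown appendfsync policy: " + policy)
        self.policy = policy
        self.buf = bytearray()  # plays the role of aof_buf
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_fsync = time.monotonic()
        self.fsync_count = 0

    def feed(self, encoded_command: bytes) -> None:
        """Called after a write command executes: buffer it in memory."""
        self.buf += encoded_command

    def flush(self) -> None:
        """Called once at the end of every event-loop iteration."""
        wrote = bool(self.buf)
        if wrote:
            os.write(self.fd, bytes(self.buf))  # hands data to the OS page cache
            self.buf.clear()
        now = time.monotonic()
        if self.policy == "always" and wrote:
            self._fsync(now)
        elif self.policy == "everysec" and now - self.last_fsync >= 1.0:
            self._fsync(now)
        # policy "no": never fsync here; the OS decides when to flush

    def _fsync(self, now: float) -> None:
        os.fsync(self.fd)  # force the data onto disk
        self.last_fsync = now
        self.fsync_count += 1

    def close(self) -> None:
        os.close(self.fd)
```

With "always", every flush that wrote data costs one fsync; with "no", the data only reaches the page cache and a power loss can drop whatever the OS has not flushed yet.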






How it works

Log rewriting uses the same copy-on-write trick already in use for snapshotting. This is how it works:

a) Redis forks, so now we have a child and a parent process.

b) The child starts writing the new AOF to a temporary file.

c) The parent accumulates all the new changes in an in-memory buffer (while also writing them to the old append-only file, so if the rewrite fails, we are safe).

d) When the child is done rewriting the file, the parent gets a signal and appends the in-memory buffer to the end of the file generated by the child.

e) Redis atomically renames the new file over the old one and starts appending new data to the new file.
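The rewrite steps above can be sketched in Python without a real fork. This is a simplified illustration: the function names encode and rewrite_aof, the ".tmp" suffix, and the use of plain string keys with SET are all assumptions made for the example, and the child's and parent's work are run sequentially in one process.

```python
import os


def encode(*args: str) -> bytes:
    """RESP-encode one command (array of bulk strings)."""
    out = b"*%d\r\n" % len(args)
    for a in args:
        raw = a.encode("utf-8")
        out += b"$%d\r\n%s\r\n" % (len(raw), raw)
    return out


def rewrite_aof(aof_path: str, snapshot: dict, rewrite_buffer: list) -> None:
    """Replace aof_path with a compact log that rebuilds the same data.

    snapshot       -- the dataset as the child saw it at fork time
    rewrite_buffer -- encoded commands the parent collected during the rewrite
    """
    tmp_path = aof_path + ".tmp"  # hypothetical temp-file name
    with open(tmp_path, "wb") as f:
        # Child's job: one SET per key, no matter how many writes
        # produced the current value.
        for key, value in snapshot.items():
            f.write(encode("SET", key, value))
        # Parent's job once the child finishes: append the buffered changes.
        for cmd in rewrite_buffer:
            f.write(cmd)
        f.flush()
        os.fsync(f.fileno())
    # Atomic switch: a reader sees either the old file or the new one.
    os.replace(tmp_path, aof_path)
```

If the old file holds 100 SET commands for the same key, the rewritten file holds a single SET for that key plus whatever arrived in the rewrite buffer.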




II. RDB Persistence





How it works

Whenever Redis needs to dump the dataset to disk, this is what happens:

a) Redis forks. We now have a child and a parent process.

b) The child starts to write the dataset to a temporary RDB file.

c) When the child is done writing the new RDB file, it replaces the old one.

This method allows Redis to benefit from copy-on-write semantics.
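The fork-and-dump sequence can be tried out with a small Python sketch. It is POSIX-only because it relies on os.fork, JSON stands in for the real RDB format, and the function name bgsave and the temp-file naming are assumptions made for this example.

```python
import json
import os


def bgsave(dataset: dict, rdb_path: str) -> int:
    """Fork a child that dumps dataset to rdb_path; return the child's pid."""
    pid = os.fork()
    if pid != 0:
        return pid  # parent: goes straight back to serving clients
    # Child: it sees the dataset exactly as it was at fork time.
    status = 1
    try:
        tmp_path = "%s.%d.tmp" % (rdb_path, os.getpid())  # hypothetical name
        with open(tmp_path, "w") as f:
            json.dump(dataset, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, rdb_path)  # atomically replace the old snapshot
        status = 0
    finally:
        os._exit(status)  # never fall back into the parent's code path


if __name__ == "__main__":
    data = {"k": "v1"}
    child = bgsave(data, "dump.json")
    data["k"] = "v2"      # a write in the parent after the fork
    os.waitpid(child, 0)  # the snapshot on disk still holds the fork-time value
```

Because the child works on its own copy-on-write view of memory, writes the parent makes after the fork do not leak into the snapshot.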


# Automatically save the dataset every N seconds if there are at least M changes in the dataset.
# (Comments go on their own lines: redis.conf does not treat a trailing "#" as a comment.)

# after 900 sec (15 min) if at least 1 key changed
save 900 1

# after 300 sec (5 min) if at least 10 keys changed
save 300 10

# after 60 sec if at least 10000 keys changed
save 60 10000
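How the save points combine can be sketched as a simple check: a snapshot is due as soon as any one (seconds, changes) pair is satisfied. The function should_save and its parameter names are hypothetical, and the exact boundary comparison Redis uses may differ from the one here.

```python
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]  # (seconds, changes)


def should_save(seconds_since_last_save: float,
                changes_since_last_save: int,
                save_points=SAVE_POINTS) -> bool:
    """Return True if at least one save point is satisfied."""
    return any(
        seconds_since_last_save >= seconds
        and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

For example, 5 changes after 400 seconds do not trigger a save (too few changes for the 300-second rule, too early for the 900-second rule), while 5 changes after 1000 seconds do.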



RDB advantages

a) RDB is a very compact single-file point-in-time representation of your Redis data. RDB files are perfect for backups. For instance, you may want to archive your RDB files every hour for the latest 24 hours, and to save an RDB snapshot every day for 30 days. This allows you to easily restore different versions of the data set in case of disasters.

b) RDB is very good for disaster recovery, being a single compact file that can be transferred to remote data centers, or onto Amazon S3 (possibly encrypted).

c) RDB maximizes Redis performance, since the only work the Redis parent process needs to do in order to persist is forking a child that will do all the rest. The parent instance will never perform disk I/O or the like.

d) RDB allows faster restarts with big datasets compared to AOF.

RDB disadvantages

a) RDB is NOT good if you need to minimize the chance of data loss in case Redis stops working (for example, after a power outage). You can configure different save points where an RDB is produced (for instance, after at least five minutes and 100 writes against the data set, but you can have multiple save points). However, you'll usually create an RDB snapshot every five minutes or more, so if Redis stops working without a correct shutdown for any reason, you should be prepared to lose the latest minutes of data.

b) RDB needs to fork() often in order to persist on disk using a child process. fork() can be time consuming if the dataset is big, and may cause Redis to stop serving clients for some milliseconds, or even for one second if the dataset is very big and CPU performance is not great. AOF also needs to fork(), but you can tune how often you want to rewrite your logs without any trade-off on durability.

AOF advantages

a) Using AOF, Redis is much more durable. You can choose among different fsync policies: no fsync at all, fsync every second, or fsync at every query. With the default policy of fsync every second, write performance is still great (fsync is performed using a background thread, and the main thread will try hard to perform writes when no fsync is in progress), but you can only lose one second's worth of writes.

b) The AOF log is an append-only log, so there are no seeks, nor corruption problems if there is a power outage. Even if the log ends with a half-written command for some reason (disk full or other reasons), the redis-check-aof tool is able to fix it easily.

c) Redis is able to automatically rewrite the AOF in the background when it gets too big. The rewrite is completely safe: while Redis continues appending to the old file, a completely new one is produced with the minimal set of operations needed to create the current data set, and once this second file is ready Redis switches the two and starts appending to the new one.

d) AOF contains a log of all the operations, one after the other, in a format that is easy to understand and parse. You can even easily export an AOF file. For instance, even if you flushed everything by mistake using a FLUSHALL command, if no rewrite of the log was performed in the meantime you can still save your data set by stopping the server, removing the latest command, and restarting Redis.
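Point b) mentions a log that ends with a half-written command. As a rough illustration of how such a tail can be detected (a toy sketch, not how redis-check-aof is actually implemented), the function below walks RESP-encoded commands and returns the byte offset where the last complete command ends; everything after that offset is the damaged tail. The names last_complete_offset, _read_line, and _parse_command are made up for this example.

```python
def _read_line(data: bytes, pos: int):
    """Return (line_without_crlf, next_pos), or None if no CRLF follows."""
    idx = data.find(b"\r\n", pos)
    if idx == -1:
        return None
    return data[pos:idx], idx + 2


def _parse_command(data: bytes, pos: int):
    """Return the offset just past one complete command, or None."""
    line = _read_line(data, pos)
    if line is None or not line[0].startswith(b"*"):
        return None
    header, pos = line
    try:
        argc = int(header[1:])
    except ValueError:
        return None
    for _ in range(argc):
        line = _read_line(data, pos)
        if line is None or not line[0].startswith(b"$"):
            return None
        header, pos = line
        try:
            length = int(header[1:])
        except ValueError:
            return None
        end = pos + length
        if length < 0 or data[end:end + 2] != b"\r\n":
            return None  # payload is cut short or malformed
        pos = end + 2
    return pos


def last_complete_offset(data: bytes) -> int:
    """Byte offset at which the last fully written command ends."""
    pos = 0
    while pos < len(data):
        end = _parse_command(data, pos)
        if end is None:
            break
        pos = end
    return pos
```

Truncating the file at the returned offset drops only the incomplete trailing command and keeps every command that was written in full.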

AOF disadvantages

a) AOF files are usually bigger than the equivalent RDB files for the same dataset.

b) AOF can be slower than RDB depending on the exact fsync policy. In general, with fsync set to every second, performance is still very high, and with fsync disabled it should be exactly as fast as RDB even under high load. Still, RDB is able to provide more guarantees about the maximum latency, even in the case of a huge write load.

Ok, so what should I use?

The general indication is that you should use both persistence methods if you want a degree of data safety comparable to what PostgreSQL can provide you.

If you care a lot about your data, but still can live with a few minutes of data loss in case of disasters, you can simply use RDB alone.

There are many users using AOF alone, but we discourage it, since having an RDB snapshot from time to time is a great idea for doing database backups, for faster restarts, and in the event of bugs in the AOF engine.





