- 浏览: 523683 次
- 性别:
- 来自: 杭州
-
文章分类
- 全部博客 (114)
- C基础 (1)
- C指针 (0)
- C语言库函数相关 (1)
- Linux (2)
- Linux网络编程 (1)
- PostgreSQL (0)
- Redis (2)
- Java Web (2)
- JAVA基础 (35)
- Ubuntu (8)
- Android (2)
- MySQL (3)
- 日志 (1)
- 书虫 (1)
- 数据结构 (0)
- 算法 (0)
- 开发工具 (1)
- 转载 (13)
- 英语 (18)
- tomcat启动脚本分析 (3)
- Oracle基础 (4)
- tomcat源码分析 (3)
- tomcat (1)
- Java相关 (1)
- Oracle基本原理--Oracle体系结构 (0)
- Oracle基本原理--表 (0)
- Oracle基本原理--索引 (0)
- Oracle基本原理--事务 (0)
- Oracle开发--SQL (1)
- Oracle基本原理--PL/SQL (0)
- Oracle基本原理--常用函数 (0)
- Oralce管理--用户及权限管理 (0)
- Oracle管理--安装调试 (0)
- Oracle管理--备份恢复 (0)
- Oralce管理--数据迁移 (0)
- Oracle管理--闪回 (0)
- Oracle管理--故障处理 (0)
- Oracle优化原理--统计信息 (0)
- Oracle优化原理--执行计划 (0)
- Oracle优化原理--诊断工具 (0)
- Oracle优化原理--深入理解表 (0)
- Oracle优化原理--深入理解索引 (0)
- Oracle优化原理--表连接原理 (0)
- Java--OOP (0)
- Java--异常 (0)
- Java--泛型 (0)
- Java--集合 (0)
- Java--IO (0)
- Java--枚举类型 (0)
- Java--注释 (0)
- Java--多线程 (0)
- Java--XML (0)
- Java--JDBC (3)
- Servlet (0)
- JSP (0)
- JSTL (0)
- 设计模式 (0)
- DAO与MVC (0)
- Javascript (2)
- Ajax (0)
- JQuery (0)
- HTML/CSS (0)
- 前端相关 (1)
- HTTP (0)
- TCP/IP (0)
- GO基础 (0)
最新评论
-
jsonmong:
推荐一个开发平台,采用的是插件化的设计思想,效果很不错的。ht ...
构建Java Web开发环境 -
wxm198427:
首先表示辛苦了!我想问个问题:我的是windows 7 x64 ...
Oracle 11g R2 for Win7旗舰版(64位)的安装步骤 -
握着橄榄枝的人:
我之前按照你的update mysql.user set pa ...
Windows7下MySQL5.5.20免安装版的配置 -
confident_f:
安装了32的客户端后,用plsql导入导出表有问题,生成不了d ...
Oracle 11g R2 for Win7旗舰版(64位)的安装步骤 -
confident_f:
安装数据库的时候第9步卡住了 是怎么回事呢?
Oracle 11g R2 for Win7旗舰版(64位)的安装步骤
As you already probably know Redis is not a plain key-value store, actually it is a data structures server, supporting different kind of values. That is, you can't just set strings as values of keys. All the following data types are supported as values:
Binary-safe strings.
Lists of binary-safe strings.
Sets of binary-safe strings, that are collection of unique unsorted elements. You can think at this as a Ruby hash where all the keys are set to the 'true' value.
Sorted sets, similar to Sets but where every element is associated to a floating number score. The elements are taken sorted by score. You can think of this as Ruby hashes where the key is the element and the value is the score, but where elements are always taken in order without requiring a sorting operation.
It's not always trivial to grasp how this data types work and what to use in order to solve a given problem from the command reference, so this document is a crash course to Redis data types and their most used patterns.
For all the examples we'll use the redis-cli utility, that's a simple but handy command line utility to issue commands against the Redis server.
Redis keys
Redis keys are binary safe, this means that you can use any binary sequence as a key, from a string like "foo" to the content of a JPEG file. The empty string is also a valid key.
A few other rules about keys:
Too long keys are not a good idea, for instance a key of 1024 bytes is not a good idea not only memory-wise, but also because the lookup of the key in the dataset may require several costly key-comparisons.
Too short keys are often not a good idea. There is little point in writing "u:1000:pwd" as key if you can write instead "user:1000:password", the latter is more readable and the added space is little compared to the space used by the key object itself and the value object. However it is not possible to deny that short keys will consume a bit less memory.
Try to stick with a schema. For instance "object-type:id:field" can be a nice idea, like in "user:1000:password". I like to use dots for multi-words fields, like in "comment:1234:reply.to".
The string type
This is the simplest Redis type. If you use only this type, Redis will be something like a memcached server with persistence.
Let's play a bit with the string type:
As you can see using the SET command and the GET command is trivial to set values to strings and have this strings returned back.
Values can be strings (including binary data) of every kind, for instance you can store a jpeg image inside a key. A value can't be bigger than 1 Gigabyte.
Even if strings are the basic values of Redis, there are interesting operations you can perform against them. For instance one is atomic increment:
The INCR command parses the string value as an integer, increments it by one, and finally sets the obtained value as the new string value. There are other similar commands like INCRBY, DECR and DECRBY. Actually internally it's always the same command, acting in a slightly different way.
What means that INCR is atomic? That even multiple clients issuing INCR against the same key will never incur into a race condition. For instance it can never happen that client 1 read "10", client 2 read "10" at the same time, both increment to 11, and set the new value of 11. The final value will always be of 12 and the read-increment-set operation is performed while all the other clients are not executing a command at the same time.
Another interesting operation on string is the GETSET command, that does just what its name suggests: Set a key to a new value, returning the old value, as result. Why this is useful? Example: you have a system that increments a Redis key using the INCR command every time your web site receives a new visit. You want to collect this information one time every hour, without losing a single key. You can GETSET the key assigning it the new value of "0" and reading the old value back.
The List type
To explain the List data type it's better to start with a little bit of theory, as the term List is often used in an improper way by information technology folks. For instance "Python Lists" are not what the name may suggest (Linked Lists), they are actually Arrays (the same data type is called Array in Ruby actually).
From a very general point of view a List is just a sequence of ordered elements: 10,20,1,2,3 is a list. But the properties of a List implemented using an Array are very different from the properties of a List implemented using a Linked List.
Redis lists are implemented via Linked Lists. This means that even if you have millions of elements inside a list, the operation of adding a new element in the head or in the tail of the list is performed in constant time. Adding a new element with the LPUSH command to the head of a ten elements list is the same speed as adding an element to the head of a 10 million elements list.
What's the downside? Accessing an element by index is very fast in lists implemented with an Array and not so fast in lists implemented by linked lists.
Redis Lists are implemented with linked lists because for a database system it is crucial to be able to add elements to a very long list in a very fast way. Another strong advantage is, as you'll see in a moment, that Redis Lists can be taken at constant length in constant time.
First steps with Redis lists
The LPUSH command adds a new element into a list, on the left (at the head), while the RPUSH command adds a new element into a list, on the right (at the tail). Finally the LRANGE command extracts ranges of elements from lists:
Note that LRANGE takes two indexes, the first and the last element of the range to return. Both the indexes can be negative to tell Redis to start to count from the end, so -1 is the last element, -2 is the penultimate element of the list, and so forth.
As you can guess from the example above, lists can be used, for instance, in order to implement a chat system. Another use is as queues in order to route messages between different processes. But the key point is that you can use Redis lists every time you require to access data in the same order they are added. This will not require any SQL ORDER BY operation, will be very fast, and will scale to millions of elements even with a toy Linux box.
For instance in ranking systems like that used by social news site reddit.com you can add every new submitted link into a List, and with LRANGE it's possible to paginate results in a trivial way.
In a blog engine implementation you can have a list for every post, where to push blog comments, and so forth.
Pushing IDs instead of the actual data in Redis lists
In the above example we pushed our "objects" (simply messages in the example) directly inside the Redis list, but this is often not the way to go, as objects can be referenced in multiple times: in a list to preserve their chronological order, in a Set to remember they are about a specific category, in another list but only if this object matches some kind of requisite, and so forth.
Let's return back to the reddit.com example. A more credible pattern for adding submitted links (news) to the list is the following:
We obtained a unique incremental ID for our news object just incrementing a key, then used this ID to create the object setting a key for every field in the object. Finally the ID of the new object was pushed on the submitted.news list.
This is just the start. Check the command reference and read about all the other list related commands. You can remove elements, rotate lists, get and set elements by index, and of course retrieve the length of the list with LLEN.
Redis Sets
Redis Sets are unordered collection of binary-safe strings. The SADD command adds a new element to a set. It's also possible to do a number of other operations against sets like testing if a given element already exists, performing the intersection, union or difference between multiple sets and so forth. An example is worth 1000 words:
I added three elements to my set and told Redis to return back all the elements. As you can see they are not sorted.
Now let's check if a given element exists:
"3" is a member of the set, while "30" is not. Sets are very good for expressing relations between objects. For instance we can easily use Redis Sets in order to implement tags.
A simple way to model this is to have, for every object you want to tag, a Set with all the IDs of the tags associated with the object, and for every tag that exists, a Set of of all the objects tagged with this tag.
For instance if our news ID 1000 is tagged with tag 1,2,5 and 77, we can specify the following two Sets:
To get all the tags for a given object is trivial:
But there are other non trivial operations that are still easy to implement using the right Redis commands. For instance we may want the list of all the objects having as tags 1, 2, 5, and 77 at the same time. We can do this using the SINTER that performs the intersection between different sets. So in order to reach our goal we can just use:
Look at the command reference to discover other Set related commands, there are a bunch of interesting ones. Also make sure to check the SORT command as both Redis Sets and Lists are sortable.
A digression: How to get unique identifiers for strings
In our tags example we showed tag IDs without mention of how the IDs can be obtained. Basically for every tag added to the system, you need a unique identifier. You also want to be sure that there are no race conditions if multiple clients are trying to add the same tag at the same time. Also, if a tag already exists, you want its ID returned, otherwise a new unique ID should be created and associated to the tag.
Redis 1.4 will add the Hash type. With it it will be trivial to associate strings with unique IDs, but how to do this today with the current commands exported by Redis in a reliable way?
Our first attempt (that is broken) can be the following. Let's suppose we want to get a unique ID for the tag "redis":
In order to make this algorithm binary safe (they are just tags but think to utf8, spaces and so forth) we start performing the SHA1 sum of the tag. SHA1(redis) = b840fc02d524045429941cc15f59e41cb7be6c52.
Let's check if this tag is already associated with a unique ID with the command GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id.
If the above GET returns an ID, return it back to the user. We already have the unique ID.
Otherwise... create a new unique ID with INCR next.tag.id (assume it returned 123456).
Finally associate this new ID to our tag with SET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456 and return the new ID to the caller.
Nice. Or better.. broken! What about if two clients perform these commands at the same time trying to get the unique ID for the tag "redis"? If the timing is right they'll both get nil from the GET operation, will both increment the next.tag.id key and will set two times the key. One of the two clients will return the wrong ID to the caller. To fix the algorithm is not hard fortunately, and this is the sane version:
In order to make this algorithm binary safe (they are just tags but think to utf8, spaces and so forth) we start performing the SHA1 sum of the tag. SHA1(redis) = b840fc02d524045429941cc15f59e41cb7be6c52.
Let's check if this tag is already associated with a unique ID with the command GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id.
If the above GET returns an ID, return it back to the user. We already have the unique ID.
Otherwise... create a new unique ID with INCR next.tag.id (assume it returned 123456).
Finally associate this new ID to our tag with SETNX tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456. By using SETNX if a different client was faster than this one the key wil not be setted. Not only, SETNX returns 1 if the key is set, 0 otherwise. So... let's add a final step to our computation.
If SETNX returned 1 (We set the key) return 123456 to the caller, it's our tag ID, otherwise perform GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id and return the value to the caller.
Sorted sets
Sets are a very handy data type, but... they are a bit too unsorted in order to fit well for a number of problems ;) This is why Redis 1.2 introduced Sorted Sets. They are very similar to Sets, collections of binary-safe strings, but this time with an associated score, and an operation similar to the List LRANGE operation to return items in order, but working against Sorted Sets, that is, the ZRANGE command.
Basically Sorted Sets are in some way the Redis equivalent of Indexes in the SQL world. For instance in our reddit.com example above there was no mention about how to generate the actual home page with news raked by user votes and time. We'll see how sorted sets can fix this problem, but it's better to start with something simpler, illustrating the basic working of this advanced data type. Let's add a few selected hackers with their year of birth as "score".
$ redis-cli zadd hackers 1940 "Alan Kay"
(integer) 1
$ redis-cli zadd hackers 1953 "Richard Stallman"
(integer) 1
$ redis-cli zadd hackers 1965 "Yukihiro Matsumoto"
(integer) 1
$ redis-cli zadd hackers 1916 "Claude Shannon"
(integer) 1
$ redis-cli zadd hackers 1969 "Linus Torvalds"
(integer) 1
$ redis-cli zadd hackers 1912 "Alan Turing"
(integer) 1
For sorted sets it's a joke to return these hackers sorted by their birth year because actually they are already sorted. Sorted sets are implemented via a dual-ported data structure containing both a skip list and a hash table, so every time we add an element Redis performs an O(log(N)) operation. That's good, but when we ask for sorted elements Redis does not have to do any work at all, it's already all sorted:
$ redis-cli zrange hackers 0 -1
1. Alan Turing
2. Claude Shannon
3. Alan Kay
4. Richard Stallman
5. Yukihiro Matsumoto
6. Linus Torvalds
Didn't know that Linus was younger than Yukihiro btw ;)
Anyway I want to order these elements the other way around, using ZREVRANGE instead of ZRANGE this time:
$ redis-cli zrevrange hackers 0 -1
1. Linus Torvalds
2. Yukihiro Matsumoto
3. Richard Stallman
4. Alan Kay
5. Claude Shannon
6. Alan Turing
A very important note, ZSets have just a "default" ordering but you are still free to call the SORT command against sorted sets to get a different ordering (but this time the server will waste CPU). An alternative for having multiple orders is to add every element in multiple sorted sets at the same time.
Operating on ranges
Sorted sets are more powerful than this. They can operate on ranges. For instance let's try to get all the individuals that were born up to the 1950. We use the ZRANGEBYSCORE command to do it:
$ redis-cli zrangebyscore hackers -inf 1950
1. Alan Turing
2. Claude Shannon
3. Alan Kay
We asked Redis to return all the elements with a score between negative infinity and 1950 (both extremes are included).
It's also possible to remove ranges of elements. For instance let's remove all the hackers born between 1940 and 1960 from the sorted set:
$ redis-cli zremrangebyscore hackers 1940 1960
(integer) 2
ZREMRANGEBYSCORE is not the best command name, but it can be very useful, and returns the number of removed elements.
Back to the Reddit example
For the last time, back to the Reddit example. Now we have a decent plan to populate a sorted set in order to generate the home page. A sorted set can contain all the news that are not older than a few days (we remove old entries from time to time using ZREMRANGEBYSCORE). A background job gets all the elements from this sorted set, get the user votes and the time of the news, and compute the score to populate the reddit.home.page sorted set with the news IDs and associated scores. To show the home page we just have to perform a blazingly fast call to ZRANGE.
From time to time we'll remove too old news from the reddit.home.page sorted set as well in order for our system to work always against a limited set of news.
Updating the scores of a sorted set
Just a final note before wrapping up this tutorial. Sorted sets scores can be updated at any time. Just calling again ZADD against an element already included in the sorted set will update its score (and position) in O(log(N)), so sorted sets are suitable even when there are tons of updates.
This tutorial is in no way complete, this is just the basics to get started with Redis, read the command reference to discover a lot more.
Thanks for reading. Salvatore.
Binary-safe strings.
Lists of binary-safe strings.
Sets of binary-safe strings, that are collection of unique unsorted elements. You can think at this as a Ruby hash where all the keys are set to the 'true' value.
Sorted sets, similar to Sets but where every element is associated to a floating number score. The elements are taken sorted by score. You can think of this as Ruby hashes where the key is the element and the value is the score, but where elements are always taken in order without requiring a sorting operation.
It's not always trivial to grasp how this data types work and what to use in order to solve a given problem from the command reference, so this document is a crash course to Redis data types and their most used patterns.
For all the examples we'll use the redis-cli utility, that's a simple but handy command line utility to issue commands against the Redis server.
Redis keys
Redis keys are binary safe, this means that you can use any binary sequence as a key, from a string like "foo" to the content of a JPEG file. The empty string is also a valid key.
A few other rules about keys:
Too long keys are not a good idea, for instance a key of 1024 bytes is not a good idea not only memory-wise, but also because the lookup of the key in the dataset may require several costly key-comparisons.
Too short keys are often not a good idea. There is little point in writing "u:1000:pwd" as key if you can write instead "user:1000:password", the latter is more readable and the added space is little compared to the space used by the key object itself and the value object. However it is not possible to deny that short keys will consume a bit less memory.
Try to stick with a schema. For instance "object-type:id:field" can be a nice idea, like in "user:1000:password". I like to use dots for multi-words fields, like in "comment:1234:reply.to".
The string type
This is the simplest Redis type. If you use only this type, Redis will be something like a memcached server with persistence.
Let's play a bit with the string type:
redis 127.0.0.1:6379> set mykey "Hello" OK redis 127.0.0.1:6379> get mykey "Hello"
As you can see using the SET command and the GET command is trivial to set values to strings and have this strings returned back.
Values can be strings (including binary data) of every kind, for instance you can store a jpeg image inside a key. A value can't be bigger than 1 Gigabyte.
Even if strings are the basic values of Redis, there are interesting operations you can perform against them. For instance one is atomic increment:
redis 127.0.0.1:6379> set counter 100 OK redis 127.0.0.1:6379> get counter "100" redis 127.0.0.1:6379> incr counter (integer) 101 redis 127.0.0.1:6379> incr counter (integer) 102 redis 127.0.0.1:6379> incrby counter 10 (integer) 112 redis 127.0.0.1:6379> get counter "112"
The INCR command parses the string value as an integer, increments it by one, and finally sets the obtained value as the new string value. There are other similar commands like INCRBY, DECR and DECRBY. Actually internally it's always the same command, acting in a slightly different way.
What means that INCR is atomic? That even multiple clients issuing INCR against the same key will never incur into a race condition. For instance it can never happen that client 1 read "10", client 2 read "10" at the same time, both increment to 11, and set the new value of 11. The final value will always be of 12 and the read-increment-set operation is performed while all the other clients are not executing a command at the same time.
Another interesting operation on string is the GETSET command, that does just what its name suggests: Set a key to a new value, returning the old value, as result. Why this is useful? Example: you have a system that increments a Redis key using the INCR command every time your web site receives a new visit. You want to collect this information one time every hour, without losing a single key. You can GETSET the key assigning it the new value of "0" and reading the old value back.
The List type
To explain the List data type it's better to start with a little bit of theory, as the term List is often used in an improper way by information technology folks. For instance "Python Lists" are not what the name may suggest (Linked Lists), they are actually Arrays (the same data type is called Array in Ruby actually).
From a very general point of view a List is just a sequence of ordered elements: 10,20,1,2,3 is a list. But the properties of a List implemented using an Array are very different from the properties of a List implemented using a Linked List.
Redis lists are implemented via Linked Lists. This means that even if you have millions of elements inside a list, the operation of adding a new element in the head or in the tail of the list is performed in constant time. Adding a new element with the LPUSH command to the head of a ten elements list is the same speed as adding an element to the head of a 10 million elements list.
What's the downside? Accessing an element by index is very fast in lists implemented with an Array and not so fast in lists implemented by linked lists.
Redis Lists are implemented with linked lists because for a database system it is crucial to be able to add elements to a very long list in a very fast way. Another strong advantage is, as you'll see in a moment, that Redis Lists can be taken at constant length in constant time.
First steps with Redis lists
The LPUSH command adds a new element into a list, on the left (at the head), while the RPUSH command adds a new element into a list, on the right (at the tail). Finally the LRANGE command extracts ranges of elements from lists:
redis 127.0.0.1:6379> rpush messages "Hello how are you ?" (integer) 1 redis 127.0.0.1:6379> rpush messages "Fine thinks.I'm having fun witn Redis" (integer) 2 redis 127.0.0.1:6379> rpush messages "I should look into this NOSQL thing ASAP" (integer) 3 redis 127.0.0.1:6379> lrange messages 0 2 1) "Hello how are you ?" 2) "Fine thinks.I'm having fun witn Redis" 3) "I should look into this NOSQL thing ASAP"
Note that LRANGE takes two indexes, the first and the last element of the range to return. Both the indexes can be negative to tell Redis to start to count from the end, so -1 is the last element, -2 is the penultimate element of the list, and so forth.
As you can guess from the example above, lists can be used, for instance, in order to implement a chat system. Another use is as queues in order to route messages between different processes. But the key point is that you can use Redis lists every time you require to access data in the same order they are added. This will not require any SQL ORDER BY operation, will be very fast, and will scale to millions of elements even with a toy Linux box.
For instance in ranking systems like that used by social news site reddit.com you can add every new submitted link into a List, and with LRANGE it's possible to paginate results in a trivial way.
In a blog engine implementation you can have a list for every post, where to push blog comments, and so forth.
Pushing IDs instead of the actual data in Redis lists
In the above example we pushed our "objects" (simply messages in the example) directly inside the Redis list, but this is often not the way to go, as objects can be referenced in multiple times: in a list to preserve their chronological order, in a Set to remember they are about a specific category, in another list but only if this object matches some kind of requisite, and so forth.
Let's return back to the reddit.com example. A more credible pattern for adding submitted links (news) to the list is the following:
redis 127.0.0.1:6379> incr next.news.id (integer) 1 redis 127.0.0.1:6379> set news:1:title "Redis is simple" OK redis 127.0.0.1:6379> set news:1:url "http://code.google/com/p/redis" OK redis 127.0.0.1:6379> lpush submitted.news 1 (integer) 1 redis 127.0.0.1:6379> lrange submitted.news 0 1 1) "1" redis 127.0.0.1:6379> get news:1:title "Redis is simple" redis 127.0.0.1:6379> get news:1:url "http://code.google/com/p/redis"
We obtained a unique incremental ID for our news object just incrementing a key, then used this ID to create the object setting a key for every field in the object. Finally the ID of the new object was pushed on the submitted.news list.
This is just the start. Check the command reference and read about all the other list related commands. You can remove elements, rotate lists, get and set elements by index, and of course retrieve the length of the list with LLEN.
Redis Sets
Redis Sets are unordered collection of binary-safe strings. The SADD command adds a new element to a set. It's also possible to do a number of other operations against sets like testing if a given element already exists, performing the intersection, union or difference between multiple sets and so forth. An example is worth 1000 words:
redis 127.0.0.1:6379> sadd myset 1 (integer) 1 redis 127.0.0.1:6379> sadd myset 2 (integer) 1 redis 127.0.0.1:6379> sadd myset 3 (integer) 1 redis 127.0.0.1:6379> smembers myset 1) "3" 2) "1" 3) "2"
I added three elements to my set and told Redis to return back all the elements. As you can see they are not sorted.
Now let's check if a given element exists:
redis 127.0.0.1:6379> sismember myset 3 (integer) 1 redis 127.0.0.1:6379> sismember myset 30 (integer) 0
"3" is a member of the set, while "30" is not. Sets are very good for expressing relations between objects. For instance we can easily use Redis Sets in order to implement tags.
A simple way to model this is to have, for every object you want to tag, a Set with all the IDs of the tags associated with the object, and for every tag that exists, a Set of of all the objects tagged with this tag.
For instance if our news ID 1000 is tagged with tag 1,2,5 and 77, we can specify the following two Sets:
$ redis-cli sadd news:1000:tags 1 (integer) 1 $ redis-cli sadd news:1000:tags 2 (integer) 1 $ redis-cli sadd news:1000:tags 5 (integer) 1 $ redis-cli sadd news:1000:tags 77 (integer) 1 $ redis-cli sadd tag:1:objects 1000 (integer) 1 $ redis-cli sadd tag:2:objects 1000 (integer) 1 $ redis-cli sadd tag:5:objects 1000 (integer) 1 $ redis-cli sadd tag:77:objects 1000 (integer) 1
To get all the tags for a given object is trivial:
redis 127.0.0.1:6379> smembers news:1000:tags 1) "1" 2) "2" 3) "5" 4) "77"
But there are other non trivial operations that are still easy to implement using the right Redis commands. For instance we may want the list of all the objects having as tags 1, 2, 5, and 77 at the same time. We can do this using the SINTER that performs the intersection between different sets. So in order to reach our goal we can just use:
redis 127.0.0.1:6379> sinter tags:1:objects tags:2:objects tags:5:objects tags:77:objects 1) "1000"
Look at the command reference to discover other Set related commands, there are a bunch of interesting ones. Also make sure to check the SORT command as both Redis Sets and Lists are sortable.
A digression: How to get unique identifiers for strings
In our tags example we showed tag IDs without mention of how the IDs can be obtained. Basically for every tag added to the system, you need a unique identifier. You also want to be sure that there are no race conditions if multiple clients are trying to add the same tag at the same time. Also, if a tag already exists, you want its ID returned, otherwise a new unique ID should be created and associated to the tag.
Redis 1.4 will add the Hash type. With it it will be trivial to associate strings with unique IDs, but how to do this today with the current commands exported by Redis in a reliable way?
Our first attempt (that is broken) can be the following. Let's suppose we want to get a unique ID for the tag "redis":
In order to make this algorithm binary safe (they are just tags but think to utf8, spaces and so forth) we start performing the SHA1 sum of the tag. SHA1(redis) = b840fc02d524045429941cc15f59e41cb7be6c52.
Let's check if this tag is already associated with a unique ID with the command GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id.
If the above GET returns an ID, return it back to the user. We already have the unique ID.
Otherwise... create a new unique ID with INCR next.tag.id (assume it returned 123456).
Finally associate this new ID to our tag with SET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456 and return the new ID to the caller.
Nice. Or better.. broken! What about if two clients perform these commands at the same time trying to get the unique ID for the tag "redis"? If the timing is right they'll both get nil from the GET operation, will both increment the next.tag.id key and will set two times the key. One of the two clients will return the wrong ID to the caller. To fix the algorithm is not hard fortunately, and this is the sane version:
In order to make this algorithm binary safe (they are just tags but think to utf8, spaces and so forth) we start performing the SHA1 sum of the tag. SHA1(redis) = b840fc02d524045429941cc15f59e41cb7be6c52.
Let's check if this tag is already associated with a unique ID with the command GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id.
If the above GET returns an ID, return it back to the user. We already have the unique ID.
Otherwise... create a new unique ID with INCR next.tag.id (assume it returned 123456).
Finally associate this new ID to our tag with SETNX tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456. By using SETNX if a different client was faster than this one the key wil not be setted. Not only, SETNX returns 1 if the key is set, 0 otherwise. So... let's add a final step to our computation.
If SETNX returned 1 (We set the key) return 123456 to the caller, it's our tag ID, otherwise perform GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id and return the value to the caller.
Sorted sets
Sets are a very handy data type, but... they are a bit too unsorted in order to fit well for a number of problems ;) This is why Redis 1.2 introduced Sorted Sets. They are very similar to Sets, collections of binary-safe strings, but this time with an associated score, and an operation similar to the List LRANGE operation to return items in order, but working against Sorted Sets, that is, the ZRANGE command.
Basically Sorted Sets are in some way the Redis equivalent of Indexes in the SQL world. For instance in our reddit.com example above there was no mention about how to generate the actual home page with news raked by user votes and time. We'll see how sorted sets can fix this problem, but it's better to start with something simpler, illustrating the basic working of this advanced data type. Let's add a few selected hackers with their year of birth as "score".
$ redis-cli zadd hackers 1940 "Alan Kay"
(integer) 1
$ redis-cli zadd hackers 1953 "Richard Stallman"
(integer) 1
$ redis-cli zadd hackers 1965 "Yukihiro Matsumoto"
(integer) 1
$ redis-cli zadd hackers 1916 "Claude Shannon"
(integer) 1
$ redis-cli zadd hackers 1969 "Linus Torvalds"
(integer) 1
$ redis-cli zadd hackers 1912 "Alan Turing"
(integer) 1
For sorted sets it's a joke to return these hackers sorted by their birth year because actually they are already sorted. Sorted sets are implemented via a dual-ported data structure containing both a skip list and a hash table, so every time we add an element Redis performs an O(log(N)) operation. That's good, but when we ask for sorted elements Redis does not have to do any work at all, it's already all sorted:
$ redis-cli zrange hackers 0 -1
1. Alan Turing
2. Claude Shannon
3. Alan Kay
4. Richard Stallman
5. Yukihiro Matsumoto
6. Linus Torvalds
Didn't know that Linus was younger than Yukihiro btw ;)
Anyway I want to order these elements the other way around, using ZREVRANGE instead of ZRANGE this time:
$ redis-cli zrevrange hackers 0 -1
1. Linus Torvalds
2. Yukihiro Matsumoto
3. Richard Stallman
4. Alan Kay
5. Claude Shannon
6. Alan Turing
A very important note, ZSets have just a "default" ordering but you are still free to call the SORT command against sorted sets to get a different ordering (but this time the server will waste CPU). An alternative for having multiple orders is to add every element in multiple sorted sets at the same time.
Operating on ranges
Sorted sets are more powerful than this. They can operate on ranges. For instance let's try to get all the individuals that were born up to the 1950. We use the ZRANGEBYSCORE command to do it:
$ redis-cli zrangebyscore hackers -inf 1950
1. Alan Turing
2. Claude Shannon
3. Alan Kay
We asked Redis to return all the elements with a score between negative infinity and 1950 (both extremes are included).
It's also possible to remove ranges of elements. For instance let's remove all the hackers born between 1940 and 1960 from the sorted set:
$ redis-cli zremrangebyscore hackers 1940 1960
(integer) 2
ZREMRANGEBYSCORE is not the best command name, but it can be very useful, and returns the number of removed elements.
Back to the Reddit example
For the last time, back to the Reddit example. Now we have a decent plan to populate a sorted set in order to generate the home page. A sorted set can contain all the news that are not older than a few days (we remove old entries from time to time using ZREMRANGEBYSCORE). A background job gets all the elements from this sorted set, get the user votes and the time of the news, and compute the score to populate the reddit.home.page sorted set with the news IDs and associated scores. To show the home page we just have to perform a blazingly fast call to ZRANGE.
From time to time we'll remove too old news from the reddit.home.page sorted set as well in order for our system to work always against a limited set of news.
Updating the scores of a sorted set
Just a final note before wrapping up this tutorial. Sorted sets scores can be updated at any time. Just calling again ZADD against an element already included in the sorted set will update its score (and position) in O(log(N)), so sorted sets are suitable even when there are tons of updates.
This tutorial is in no way complete, this is just the basics to get started with Redis, read the command reference to discover a lot more.
Thanks for reading. Salvatore.
相关推荐
T型三电平+SVPWM的下垂控制与双闭环中点电位平衡控制.pdf
STM32真实企业级项目:锅炉控制器源码、原理图与PCB图.pdf
STM32F103 Modbus主站源码:正常使役,支持多从机功能码通信及从机寄存器写入.pdf
Simulink永磁同步直驱风机PMSG一次调频离散模型:含虚拟惯性与下垂控制,可扩展至光伏储能研究.pdf
VSG仿真、并网与离网运行仿真、预同期并网控制及虚拟同步机逆变器仿真.pdf
VIC水文模型全程视频教学指导.pdf
vrep_coppeliasim+matlab机器人轨迹控制仿真:利用matlab读取轨迹并控制机械臂在墙上绘图的详细学习示例.pdf
2000-2022年上市公司行业异质性数据(技术密集型、劳动密集型、资本密集型)(含原始数据和处理代码) 1、时间:2000-2022年 2、指标:股票代码、年份、股票简称、统计日期、行业名称、行业代码、成立日期、上市日期、所在省份、所在城市、上市状态、保留两位行业代码、保留一位行业代码、高科技为1,非高科技为0、重污染为1,非重污染为0、制造业为1,非制造业为0、劳动密集型为1,资本密集型为2,技术密集型为3 3、来源:csmar 4、根据2012年中国证监会行业划分是否高科技、是否重污染、是否制造业、是否劳动密集型、资本密集型、技术密集型。 5、内容:包括原始数据、处理代码和计算结果
TMS320F28335电机控制程序:BLDC、PMSM无感有感及异步VF程序源代码与开发资料大全.pdf
tc275、s12x、s32k144基于CANoe的UDS诊断数据库CDD文件及CAPL Boot上位机、下位机程序移植说明文档.pdf
STM32系列通信透传技术:以太网、串口、CAN透传及OBD协议解析.pdf
STM32开发:IIR带阻滤波器设计与实现.pdf
UG后处理:CNC西门子828D后处理与西门子后处理工厂实战自用.pdf
MYSQL深入学习总结.pdf
Stewart六自由度平台反解算法 C#.pdf
1、文件说明: Centos8操作系统vim-ale-3.3.0-1.el8.rpm以及相关依赖,全打包为一个tar.gz压缩包 2、安装指令: #Step1、解压 tar -zxvf vim-ale-3.3.0-1.el8.tar.gz #Step2、进入解压后的目录,执行安装 sudo rpm -ivh *.rpm
tc275、s12x和s32k144的Boot程序及UDS故障诊断与Bootloader移植的Python自制上位机源码.pdf
SSA-CNN-LSTM时间序列预测(Matlab)_ 麻雀算法优化卷积长短期记忆网络.pdf
UI篇:C#工控上位机Chart控件实现与展示.pdf
SRM12-8开关磁阻电机,功率2200w,额定转速3450rpm.pdf