Horizontal Database Partitioning with Spring and Hibernate
About a year ago we decided to scale our database horizontally - that is, partition it. We had many millions of users in the database, and we were contemplating allowing a lot more user-generated content on our site, as well as collecting much more data on what our users were doing. We had been burned many times by the vertical-scaling strategy ("buy a bigger box") - it's harder and harder to get the money for the next bigger box, you can only get one or two big boxes at a time, everything ends up on that box ("it's the only box powerful enough"), and when it crashes the entire world goes down. So we decided to partition horizontally with commodity hardware.
A MySQL consultant specializing in scalability recommended that we partition horizontally by user: a user and all her data (profile, user-generated content, etc.) would be held in one particular partition. A global user database (GLUD) would be the key to this array of databases: GLUD would store each user's primary key and the ID of the partition in which that user resides.
So we went to work. Our initial idea was to create a Hibernate session factory for each partition. Let's say we have two user databases, user1 and user2. Then we'd have two session factories, one for each database. Each service that used those databases (e.g., the ProfileService) would have one instance created per database: Profile1Service would connect to profile1Dao, which would use user1SessionFactory. Repeat for N partitions. Calls to the service would encounter a Spring AOP interceptor that would grab the user's identifier, call the GLUD to determine which partition the user's data was in, and then route the call to the correct instance of the ProfileService.
We implemented a prototype of this, and it worked OK. Then we came across two ideas. The first was a blog post by Interface21's Mark Fisher in which he introduced the AbstractRoutingDataSource. The second was Hibernate Shards. Mark's scheme would have us create only one ProfileService, one ProfileDao, and one UserSessionFactory, and make the datasource aware of the multiple user databases. Hibernate Shards was a project that had just been released and that worked similarly to our first idea, creating a separate session factory instance for each database.
We really wanted to use Shards rather than write our own partitioning system. But Shards had just been released as a beta, and in the end we decided not to use it, for several reasons. First, we watched for several weeks, but there was very little activity in the Shards source code repository; we didn't feel safe betting our core infrastructure on a project that was so new and uncertain. Second, the many-session-factory strategy is inherently unscalable: you need a new Hibernate session factory for each new database partition. If you become MySpace-like successful, you'd need a hundred session factories.
Given all the literature about session factories being resource intensive, we weren't comfortable with that thought (this was also the rap on our initial stab at partitioning, above). Finally, looking at the initial Shards docs, it wasn't clear how one would integrate and configure it with Spring. Spring's LocalSessionFactoryBean wouldn't work, and I didn't relish the idea of digging into Spring's transaction infrastructure to build a ShardsSessionFactoryBean that properly integrated with Spring's transaction management. So we decided to go with the routing-datasource method.
I'll take you through how we set this up, and then what we see as the pros and cons.
First is the GLUD database. This database contains the master_user table, which has the primary key and email address of all the users in all the partitions. In truth, it contains all the uniquely constrained attributes of a user, as it's the only place where a unique database constraint can be applied to a column across all users, but for this explanation let's assume the only unique field (other than the PK) is email.
Given a user's email address, the master_user table can be used to locate the user's primary key. Another table is the partition_map. This contains a mapping from a hash of the user's primary key (PK) to a partition id. So once you have a user's PK, you hash it, then look up the partition in the partition_map. The hash function we used was just the last three digits of the PK, which gave us 1000 virtual partitions. The number of physical partitions can be anything from 1 to 1000 in this scheme. For example, if you only used two partitions you could map virtual partitions 000-499 to user database 1 and 500-999 to user database 2 (or you could go even/odd, or whatever). The point is that once you have the user's PK or email, you can determine the partition id of the database that has her data.
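The lookup from PK to physical partition can be sketched in a few lines of plain Java. This is an illustration only: the class `PartitionMap`, its method names, and the in-memory `TreeMap` standing in for the partition_map table are all hypothetical and are not taken from the production system.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for the partition_map table in GLUD. */
class PartitionMap {
    // Maps the first virtual partition of each range to a physical partition id.
    private final TreeMap<Integer, Integer> rangeStartToPhysical = new TreeMap<>();

    /** Declares that virtual partitions from rangeStart upward live in physicalId. */
    void addRange(int rangeStart, int physicalId) {
        if (rangeStart < 0 || rangeStart > 999) {
            throw new IllegalArgumentException("virtual partition out of range: " + rangeStart);
        }
        rangeStartToPhysical.put(rangeStart, physicalId);
    }

    /** The "hash": the last three decimal digits of the primary key (0..999). */
    static int virtualPartition(long userPk) {
        return (int) Math.floorMod(userPk, 1000L);
    }

    /** Resolves a user's primary key to the physical partition holding her data. */
    int physicalPartition(long userPk) {
        Map.Entry<Integer, Integer> range =
                rangeStartToPhysical.floorEntry(virtualPartition(userPk));
        if (range == null) {
            throw new IllegalStateException("no partition mapped for PK " + userPk);
        }
        return range.getValue();
    }
}
```

With the ranges 0 → database 1 and 500 → database 2 registered, a PK of 1234567 hashes to virtual partition 567 and resolves to database 2. Because callers only ever see the virtual partition, moving a range to new hardware changes rows in the map and nothing else.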
So who is in charge of doing this partition location calculation? We wrote a Spring AOP interceptor to wrap all the services that use the partitioned database. The interceptor was able to use the GLUD database (via an intermediary GludService) to determine which partition to route to.
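The final question was: how does the interceptor know which user the current operation is associated with? Rather than rely on magic, we decided that the first parameter of each partition-aware method would identify the user: it would be either the User object itself, or the user's PK or email. Any of these serves to identify the partition that the data is in. This is a leaky abstraction: the presence of the partitioning system now shows up as goofy method signatures in the services that use the partitioned database. Here is how a method in the interceptor might look:

    public Object selectExistingPartitionWithUser(ProceedingJoinPoint jp,
                                                  LocatePreexistingUser annotation,
                                                  User user) throws Throwable {
        // Ask GLUD which partition holds this user's data
        GludEntry gludEntry = getGludService().getGludEntryForExistingUser(user);
        int partitionNumber = gludEntry.getDatabasePartition();
        datasourceNumberCache.set(partitionNumber);
        Object returnValue = null;
        try {
            returnValue = jp.proceed();
        } finally {
            // Always clear the partition so it cannot leak to the next call on this thread
            datasourceNumberCache.remove();
        }
        return returnValue;
    }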
Here datasourceNumberCache is a public static final ThreadLocal<Integer> that holds the partition id for the user associated with this operation. We'll see who reads this ThreadLocal a little later on.
We used the AspectJ pointcut language to describe our pointcuts. This allowed us to write type-safe method signatures for our interceptor, as you see above (no Method objects or Object[] of parameters). We also realized that many different types of interception would be necessary. Above we see the simplest case, that of looking up data associated with a user. But what if the user is updating her email (or another unique field held in the GLUD database)?
What if we are creating a new user? What if it's an operation that should be "broadcast" to all partitions (count all the items created by all the users in the last week)? What if we need to load all user-generated content for an indexing process, and we need to batch load? All of these operations require different methods in the interceptor. How to bind the proper methods from the interceptor to the appropriate methods on the services?
For this we used annotations. You can see the annotation instance (yes, there are such things!) being passed in by the Spring infrastructure in the above method signature. Now you are prepared to appreciate the pointcut for the method in all its glory:

    @Around(value="@annotation(annotation) && args(user, ..)", argNames="annotation,user")
You can poke around the Spring docs and the AspectJ site to understand this completely, but basically it says: bind to any method that is annotated with LocatePreexistingUser and has a User object as its first parameter. The argNames attribute was necessary to get the annotation and the User object passed in correctly. As I remember, that funkiness only occurred when there was more than one argument binding in the pointcut, or something like that; I just remember that it was really difficult to get working until I stumbled across argNames.
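What's way cool about annotations used this way is that you can pass data from the annotated method to the interceptor. For example, here's the definition of the above annotation:

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface LocatePreexistingUser {
        public UserIdentifier userIdentifier() default UserIdentifier.USER_OBJECT;
        public boolean userUpdate() default false;
    }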
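Here, UserIdentifier is an enum with values USER_OBJECT, EMAIL, and USER_PK. If you are updating one of the uniquely constrained fields held in the GLUD database (email, for example), you can annotate the method on your UserProfileService like this:

    @LocatePreexistingUser(userUpdate=true)
    public void updateEmail(User user, String newEmail) { ... }

And then the interceptor can contain code like this:

    if (annotation.userUpdate()) {
        // tell GLUD service to update its master_user record
    }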
This is really nifty. You can pass in information to the interceptor that tells it how to process invocations of that
particular method, and the annotation that specifies that information is right there at the method definition. Again, I
think this is neat.
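To make the mechanics concrete, here is a minimal sketch of reading such an annotation attribute at runtime. It uses plain reflection rather than Spring AOP so that it runs on its own. The annotation and enum are redeclared only to keep the sketch self-contained, and `ProfileOps` and `AnnotationReader` are hypothetical names invented for the illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

enum UserIdentifier { USER_OBJECT, EMAIL, USER_PK }

// Redeclared here only so the sketch compiles on its own.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface LocatePreexistingUser {
    UserIdentifier userIdentifier() default UserIdentifier.USER_OBJECT;
    boolean userUpdate() default false;
}

/** Hypothetical service; the method bodies are irrelevant to the sketch. */
class ProfileOps {
    @LocatePreexistingUser(userIdentifier = UserIdentifier.EMAIL)
    public void loadProfile(String email) { }

    @LocatePreexistingUser(userUpdate = true)
    public void updateEmail(Object user, String newEmail) { }
}

class AnnotationReader {
    /** Returns true when the named method asks for the GLUD record to be updated. */
    static boolean requiresGludUpdate(Class<?> type, String methodName, Class<?>... params) {
        try {
            Method method = type.getMethod(methodName, params);
            LocatePreexistingUser annotation =
                    method.getAnnotation(LocatePreexistingUser.class);
            // Annotation attributes are read through accessor methods, not fields.
            return annotation != null && annotation.userUpdate();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }
}
```

The annotation must have RUNTIME retention for this to work; with the default retention, `getAnnotation` would return null.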
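When Hibernate is ready to send some SQL to the database, it asks the datasource for a connection. The PartitionRoutingDataSource reads the partition id out of the ThreadLocal and returns a connection to that database. It extends Spring's AbstractRoutingDataSource, and the operative method looks something like this:

    protected Object determineCurrentLookupKey() {
        // Partition id set by the DatasourceSwitchingAspect for the current thread
        Integer datasourceNumber = DatasourceSwitchingAspect.datasourceNumberCache.get();
        return datasourceNumber;
    }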
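The datasource configuration in Spring has two normal datasources (for the two-partition case), user1DataSource and user2DataSource. These are standard lookups (we use JBoss's connection pools, looked up through JNDI) pointing directly at the physical databases. However, the datasource that we feed to the (single) Hibernate session factory is configured like this:

    <bean id="userDataSource" class="PartitionRoutingDataSource">
      <property name="targetDataSources">
        <map key-type="java.lang.Integer">
          <entry key="1" value-ref="user1DataSource"/>
          <entry key="2" value-ref="user2DataSource"/>
        </map>
      </property>
    </bean>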
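The Hibernate session factory is created in the usual way by Spring using this magical datasource:

    <bean id="userSessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean">
      <property name="dataSource" ref="userDataSource"/>
      <!-- etc. etc. -->
    </bean>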
Nothing special here. Now, to enable more database partitions, one simply configures the connection pools in the appserver, adds the datasource to Spring, and then adds the reference to the PartitionRoutingDataSource. Done! And the nice thing is that you can handle an arbitrary number of database partitions without creating a zillion session factories.
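The routing idea itself is small enough to model without Spring. The sketch below is a simplified stand-in, not Spring's AbstractRoutingDataSource: `PartitionContext` and `RoutingTarget` are hypothetical names, and the type parameter `T` takes the place of a real javax.sql.DataSource so that the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical holder for the partition chosen for the current thread. */
final class PartitionContext {
    private static final ThreadLocal<Integer> CURRENT = new ThreadLocal<>();

    /** Runs work with the given partition selected, always clearing it afterwards. */
    static <R> R withPartition(int partitionId, Supplier<R> work) {
        CURRENT.set(partitionId);
        try {
            return work.get();
        } finally {
            CURRENT.remove();
        }
    }

    static Integer current() {
        return CURRENT.get();
    }
}

/** Simplified model of a routing datasource: T stands in for javax.sql.DataSource. */
class RoutingTarget<T> {
    private final Map<Integer, T> targets = new HashMap<>();

    void register(int partitionId, T target) {
        targets.put(partitionId, target);
    }

    /** Picks the target for the partition selected on the calling thread. */
    T resolve() {
        Integer key = PartitionContext.current();
        if (key == null) {
            throw new IllegalStateException("no partition selected for this thread");
        }
        T target = targets.get(key);
        if (target == null) {
            throw new IllegalStateException("no target registered for partition " + key);
        }
        return target;
    }
}
```

In this model, adding a third partition is one more `register` call, which mirrors adding one more entry to the targetDataSources map. Failing loudly when no partition has been selected is a design choice of the sketch; it surfaces calls that bypassed the interceptor instead of silently hitting some default database.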
There are some more interesting complications and details to worry about. For example, you want to make sure that you've set the partition number before you open a transaction in Spring, as Hibernate sometimes aggressively grabs a connection. In other words, you need to make sure that the "order" attribute on the DatasourceSwitchingAspect is lower than the one on the transaction interceptor. Here, the DatasourceSwitchingAspect is set to order=1, and the transaction interceptor is set to order=2.

    <aop:config>
      <aop:pointcut id="profileServicePointcut" expression="execution(* *..ProfileService.*(..))"/>
      <aop:advisor advice-ref="userTxAdvice" pointcut-ref="profileServicePointcut" order="2"/>
    </aop:config>
The "userTxAdvice" is the usual old transaction advice in Spring ("<tx:advice ...>"). The transaction manager is a
normal HibernateTransactionManager.
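Why the order values matter can be shown with a toy model of around-advice. This is a sketch under stated assumptions, not Spring's implementation: `AdviceChain` and its nested `Advice` interface are hypothetical, and the only behavior modeled is that the advice with the lowest order value is entered first and therefore wraps everything with a higher value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical model of ordered around-advice; lower order values run outermost. */
class AdviceChain {
    /** An around-advice: receives the rest of the chain and decides when to call it. */
    interface Advice {
        String around(Supplier<String> next);
    }

    private static final class Ordered {
        final int order;
        final Advice advice;

        Ordered(int order, Advice advice) {
            this.order = order;
            this.advice = advice;
        }
    }

    private final List<Ordered> advices = new ArrayList<>();

    AdviceChain add(int order, Advice advice) {
        advices.add(new Ordered(order, advice));
        return this;
    }

    /** Wraps the target so that the advice with the lowest order is entered first. */
    String invoke(Supplier<String> target) {
        List<Ordered> sorted = new ArrayList<>(advices);
        sorted.sort(Comparator.comparingInt((Ordered o) -> o.order).reversed());
        Supplier<String> call = target;
        for (Ordered o : sorted) {
            // Highest order wraps the target first, so it ends up innermost.
            Supplier<String> inner = call;
            call = () -> o.advice.around(inner);
        }
        return call.get();
    }
}
```

If the partition-selecting advice is registered with order 1 and the transactional advice with order 2, the transactional advice runs inside the partition advice and sees the partition already set. Swap the two values and the transaction would start before any partition had been chosen.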
One of the comments on Mark Fisher's blog indicated that this sort of configuration would be a mess with the second-level cache in Hibernate unless the id spaces were kept separate. We thought of several ways to do this (assigning a range to each database, etc.), but the DBAs really liked the idea of a high-low table in GLUD, so we decided to implement that. Hibernate has a high-low primary key generator, but it assumes that the sequence tables are in the same database that you are inserting into, while ours were going to be in the GLUD database. To implement this without writing our own high-low key generator, we had to write a wrapper class for Hibernate's key generator. The wrapper class simply grabs a session from the GLUD Hibernate session factory and passes it to the key generator. The GLUD session comes from an ApplicationContextAware Singleton object (gasp!) that holds a reference to the Spring application context and grabs the GLUD session when necessary. The gludSessionFactory couldn't be dependency-injected via Spring because Hibernate creates the key generators under the covers at an undisclosed location, hence the resort to the Singleton (anti)pattern.
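The general high-low idea with a central table can be sketched as follows. This is not Hibernate's generator and not the wrapper class described above: `CentralHiTable` and `HiLoIdAllocator` are hypothetical names, the central table is simulated with an in-memory counter, and the arithmetic is the textbook form of the scheme.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the high-low table kept in the central GLUD database. */
class CentralHiTable {
    private final AtomicLong nextHi = new AtomicLong(0);

    /** In a real system this would be a short transaction against GLUD. */
    long reserveNextHi() {
        return nextHi.getAndIncrement();
    }
}

/** Hands out ids from blocks reserved centrally, so ids never repeat across partitions. */
class HiLoIdAllocator {
    private final CentralHiTable central;
    private final int blockSize;
    private long hi;
    private int lo;

    HiLoIdAllocator(CentralHiTable central, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.central = central;
        this.blockSize = blockSize;
        this.lo = blockSize; // forces a reservation on first use
    }

    synchronized long nextId() {
        if (lo >= blockSize) {
            // Current block is used up: reserve the next one from the central table.
            hi = central.reserveNextHi();
            lo = 0;
        }
        return hi * blockSize + lo++;
    }
}
```

Two allocators that share one `CentralHiTable` never hand out the same id, because each block of `blockSize` ids is reserved exactly once. The central table is only touched once per block, which keeps the load on GLUD low.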
In our Hibernate mapping files, now the objects need to use this id generator class:
So overall, this partitioning scheme works pretty nicely, it's in production and seems to be fairly performant. However,
if you are thinking of implementing horizontal partitioning for your application, I'd like to point out several gotchas that
you need to know about.
2nd level cache
We have had a fair number of glitches with the Hibernate second-level cache. In short, faking out a Hibernate session factory (which believes with all its heart and soul that it's connected to a single database) to work with multiple databases is fraught with peril. For object caching, it's generally OK, as our id space is unique and Foo#1 is only found in one partition. However, query caching is a nightmare.
Let's say you issue a query "give me all the blog entries since last Sunday". First the query runs against partition 1, and the results are cached in the query cache. Then the interceptor attempts to run the query against partition 2, but since the query is cached, the same result set comes back. Those objects are not in partition 2, but they are in the cache, so in general you end up getting dupes of everything in the first partition and nothing from later partitions. Think really hard about any query caching you attempt in this scheme.
In general, you can be operating in a session attached to partition N but working with objects from partition M (because you found them in the second-level cache). If Hibernate ever decides it wants to go to the db to fill out those objects, you are hosed, because you are attached to the wrong db.
If you went with a Shards-style solution with one session factory (and hence one second-level cache) per database, this sort of thing would be completely eliminated.
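The query-cache collision is easy to reproduce in miniature. The sketch below says nothing about how Hibernate keys or configures its query cache; `QueryCacheDemo` is a hypothetical class that only contrasts a cache keyed by query text with one keyed by partition plus query text.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical query cache used to show the effect of the cache key, nothing more. */
class QueryCacheDemo {
    private final Map<String, List<String>> cache = new HashMap<>();
    private final boolean partitionAwareKey;

    QueryCacheDemo(boolean partitionAwareKey) {
        this.partitionAwareKey = partitionAwareKey;
    }

    /** Returns cached rows if present, otherwise runs the query against the partition. */
    List<String> query(int partitionId, String sql, Function<Integer, List<String>> database) {
        // With a key built from the query text alone, every partition shares one entry.
        String key = partitionAwareKey ? partitionId + "|" + sql : sql;
        return cache.computeIfAbsent(key, k -> database.apply(partitionId));
    }
}
```

With `partitionAwareKey` set to false, a query run first against partition 1 and then against partition 2 returns partition 1's rows both times, which is the duplication described above. With it set to true, each partition gets its own cache entry.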
JTA
The partitioned user databases and the GLUD database really form a unit: you don't want transactions committing in one database and not in the other. If you want to wrap them in a single transaction, you might want to use JTA. I'm not convinced JTA will work in this scheme. Imagine this scenario: you open your JTA transaction, then touch the Hibernate session factory and talk to database partition 1. If you make some updates, Hibernate generally holds onto the SQL until the end of the transaction. Now, in the same JTA transaction, you talk to partition 2. I can see one of two really bad things happening: 1) Hibernate says "hey, I already have a connection in this session" and uses the partition 1 connection, or 2) when the transaction commits, the session has SQL for partition 1 and partition 2 stored up. How does it know which statements to send to which partition?
There are some scattered statements in the Hibernate documentation that lead me to think that you can configure Hibernate to aggressively issue the SQL and release the connection on a statement-by-statement basis. However, I've not verified this. I once attempted to set up a JTA transaction manager in our app but couldn't get it to work (Spring's JtaTransactionManager refused to find JBoss's installed transaction manager). I only spent an hour or so on it, and now I think I know what the problem was (duplicate jta.jar files in the classpath).
Again, in a one-session-factory-per-database style partitioning scheme, JTA should work fine.
Testing
Testing becomes painful when partitioning is involved (and this is not specific to our style of partitioning). We've evolved two types of tests against the partitioned databases. The first are ROITs (regular old integration tests) that test DAOs against a single partition. The second are the partitioned integration tests, which use GLUD and the partitions together. You have to write the latter at least to test the interceptor, the routing datasource, and all the XML config, to make sure it all works properly. However, it takes some creativity to set up the application context for these tests without either instantiating the entire world or writing duplicate configuration files for everything. Using DbUnit is a bit of a challenge as well, as you generally need to insert/update data in both GLUD and the partitions (possibly several).
Shared Objects
Finally, one more pain point that you need to be aware of: objects with shared identity across partitions are a mess. Let's say our Blog object has a Category, and it's a many-to-many relationship. If you want the Category objects in the same database as the Blogs, they need to appear in each partitioned database. So Category(id=1, name="java") appears in two databases. When they are loaded into the second-level cache they will fight to the death and visit pain and suffering upon all transactions that dare to go there. You could turn off caching for these things, turn off optimistic locking (version), or put them in another database (GLUD?). Again, if you had separate session factories (with the concomitant separate second-level caches) this sort of thing wouldn't hurt so badly.
I hope you found this description interesting. If you are going to partition, you might want to consider doing it this way.
However, do be aware of all the problems noted above: the second-level cache (and possibly JTA) will not quite work
correctly. The one-session-factory-per-database configuration will consume more resources (and slow down the
startup of the app) but would most likely solve these issues.
I think if I were to start again I'd go back to the original way of multiple session factories (but god, how I hate watching them all start up when I bounce my app server). Or I'd check to see what's up with Shards these days; that might be even better. We may yet change to one of those methods because of the issues noted above. Nevertheless, it's quite an interesting challenge to implement partitioning with these technologies. Let me know what your experiences are!
关键字: Database
About a year ago we decided to scale our database horizontally - that is, partition it. We had many millions of users in the database, and we were contemplating allowing a lot more user-generated content on our site, as well as collecting much more data on what our users were doing. We had been burned many times by the vertical-scaling strategy ("buy a bigger box") - it's harder and harder to get the money for the next bigger box, you can only get one or two big boxes at a time, everything ends up on that box ("it's the only box powerful enough"), and when it crashes the entire world goes down. So we decided to partition horizontally with commodity hardware.
A MySql consultant specializing in scalability recommended that we partition horizontally based on user: a user and all her data (profile, user-generated content, etc) would be held in a particular partition. A global user database (GLUD) would be key to this array of databases: GLUD would store each user's primary key and the partition ID that the user resided in.
一个用户统一数据库应该是这组数据库的关键。GLUD将存储每个用户的主键和用户所在数据库区域的partition ID
So we went to work. Our initial idea was to create a Hibernate session factory for each partition. Let's say we have two user databases, user1 and user2. Then we'd have two session factories, one for each database. The services that used those databases (eg, the ProfileService) would have one instance created for each database. Profile1Service would connect to profile1Dao, which would use user1SessionFactory. Repeat for N partitions. Calls to the service would encounter a Spring AOP interceptor that would grab the user's identifier, call the GLUD to determine which partition the user's data was in, and then route the call to the correct instantance of the ProfileService.
我们开始实现。我们最初的想法是为每个分区分别创建一个hibernate session factory。我们有2个用户数据库,user1,user2。接下来我们有2个session factories,分别指向2个数据库。Profile1Service连接profile1Dao,连接user1SessionFactory。几个区域就重复几次。调用这个service将遇到Spring AOP 拦截器获得用户的ID,查询GLUD数据库来判断用户数据在哪个区块,然后路由ProfileService调用到正确的数据库。
We implemented a prototype of this, and it worked ok. Then we came across two ideas. First was a blog by Interface21's Mark Fisher where he introduced the AbstractRoutingDataSource. The second was Hibernate Shards. Mark's scheme would have us create only one ProfileService, one ProfileDao, and one UserSessionFactory, and have the datasource be aware of the multiple user databases. Hibernate Shards was a project that just was released that worked similarly to our first idea, creating a separate session factory instance for each database.
我们实现了这个原型并且它工作的还不错。接着我们遇到了2个想法。第一个是Interface21's Mark 的blog中提到的AbstractRoutingDataSource 。第二个是Hibernate Shards。 Mark's的方案使我们只需要创建一个ProfileService,一个ProfileDao,一个UserSessionFactory,并且使数据库知道多个用户表。Hibernate Shards是一个类似我们第一个想法的项目,为每一个数据库创建独立的session factory。
We really wanted to use Shards, rather than write our own partitioning system. But Shards had just been released as a beta. In the end we decided not to use Shards for several reasons: we watched for several weeks, but there was very little activity in the Shards source code repository. We didn't feel safe betting our core infrastructure on a project that was so new and uncertain. Second, the many-session-factory strategy is inherently unscalable: you need a new Hibernate session factory for each new database partition. If you become MySpace-like successful, you'd need a hundred session factories.
我们确实很希望使用Shards,而不是我们自己去实现一个分区系统。但是Shards只是一个beta版本。最终我们没有使用Shards,因为我们观察了一段时间,发现这个项目不太活跃,我们没有安全感,把我们的核心基础架构在这样新并且不确定的项目上。其次,多session-factory策略天生就不可扩展,你需要为每个数据库分区创建Hibernate session factory,如果你变为类似MySpace一样的成功,你需要数百个session factories。
Given all the literature talking about session factories being resource intensive, we weren't comfortable with that thought (this was also the rap on our initial stab at partitioning, above). Finally, looking at the initial Shards docs, it wasn't clear how one would integrate it and configure it with Spring. Spring's LocalSessionFactoryBean wouldn't work. I didn't relish the idea of digging into Spring's transaction infrastructure to build a ShardsSessionFactoryBean that properly integrated with the transaction management in Spring. So we decided to go with the routing-datasource method.
根据文献提到的session factories 消耗资源,我们认为这不是一个好的想法。最终查找最初的Shards 文档,它没有清楚的说明如果集成Shards以及在spring中配置它。Spring的LocalSessionFactoryBean无法工作。我没有深入的探索spring的事务基础结构来创建一个ShardsSessionFactoryBean来合适的集成事务管理。所以我们决定使用Routing datasource方法。
I'll take you through how we set this up, and then what we see as the pros and cons.
First is the GLUD database. This database contains the master_user table, which has the primary key and email address of all the users in all the partitions. In truth, it contains all the uniquely constrained atttibutes of a user, as it's the only place where a unique database constraint can be applied to the column, but for this explanation let's assume
the only unique field (other than PK) is email.
Given a user's email address, the master_user table can be used to locate the user's primary key. Another table is the partition_map. This contains a mapping of a hash of the user's primary key (PK) to a partition id. So once you have a user's PK, you hash it, then look up the partition in the partition_map. The hash function we used was just the last three digits of the PK, so that we allocated 1000 virtual partition. The number of physical partitions can be any number from 1 to 1000 in this scheme. For example, you could map paritions 000-499 to user database 1, and 500-999 to user database 2 if you only used two partitions (or you could go even/odd, or whatever). The point is now that once you have the user's PK or email, you can determine the partition id of the database that has her data.
我们通过master_user表email找到这个用户对应的primary key。另一个表是partition_map,这个表包含了一个hash的user primary key到partition id的映射。所以一旦你有一个用户的pk,对pk做hash操作,然后在partition_map表中查找用户的分区。 我们使用的Hash函数就是PK 的后三位,所以我们可以分配1000个虚拟分区。按照这个方案,物理分区可以是1到1000中的任意数字。如果你只有2个分区,你可以映射000-499结尾的user pk至database1,500-999至用户 database2。现在一旦你有用户的pk id或者email,你可以找到它的数据分区的database的id。
So who is in charge of doing this partition location calculation? We wrote a Spring AOP interceptor to wrap all the services that use the partitioned database. The interceptor was able to use the GLUD database (via an intermediary GludService) to determine which partition to route to.
那么谁来管理这个分区计算呢?我们写了一个spring aop拦截器来包装所有使用分区数据库的服务。这个拦截器能使用GLUD数据库,通过一个中间件 GludService来决定路由到哪个分区。
The final question was, how does the interceptor know which user the current operation is associated with? Rather than rely on magic, we made a decision that the first parameter of each method call that should be partition aware would identify the user: it would be either the User object itself, or the user's PK or email.
Any of these would serve to identify the parition that the data was in. This is a leaky abstraction:
the presence of the partitioning system now manifests in goofy method signatures in services that use the partitioned database. Here is how a method in the interceptor might look:
public Object selectExistingPartitionWithUser(ProceedingJoinPoint jp, LocatePreexistingUser annotation, User user) throws Throwable { GludEntry gludEntry = getGludService().getGludEntryForExistingUser(user); int partitionNumber = gludEntry.getDatabasePartition(); datasourceNumberCache.set(partitionNumber); Object returnValue = null; try { returnValue = jp.proceed(); } finally { datasourceNumberCache.remove(); } return returnValue; }
Here datasourceNumberCache is a public static final ThreadLocal<Integer> that holds the partition id for the user associated with this operation. We'll see who reads this ThreadLocal a little later on.We used the AspectJ pointcut language to describe our pointcuts. This allowed us to write type-safe method signatures for our interceptor, as you see above (no Method objects or Object[] of parameters). Also, we realized that many different types of interception would be necessary. Above we see the simplest case, that of looking up data associated with a user. But what if the user is updating her email (or other unique field held in the GLUD database)?
这里datasourceNumberCache一个public static final ThreadLocal<Integer>对象,来保存用户的partition id。我们将看到谁读了threadLocal对象。我们使用了AspectJ pointcut语言来描述我们的切点。这允许我们为我们的拦截器编写类型安全的方法签名。我们认识到许多不同的拦截类型将是必要的。上面我们看到了最简单的例子,如何按照用户查找数据。但是如果用户更新他的email,或者其他在GLUD数据库中唯一的字段。
What if we are creating a new user? What if it's an operation that should be "broadcast" to all partitions (count all the items created by all the users in the last week)? What if we need to load all user-generated content for an indexing process, and we need to batch load? All of these operations require different methods in the interceptor. How to bind the proper methods from the interceptor to the appropriate methods on the services?
For this we used annotations. You can see the annotation instance (yes, there are such things!) being passed in by the Spring infrastructure in the above method signature. Now you are prepared to appreciate the pointcut for the method in all its glory:
@Around(value="@annotation(annotation) && args(user, ..)", argNames="annotation,user")
You can poke around the Spring docs and the AspectJ site to completely understand this, but basically it says "bind to any method annotated with "LocatePreexistingUser" and that has a User object as the first parameter". The "argNames" section was necessary to get the annotation and User object to be passed in correctly - something funky that as I remember only occurred if there was more than one argument binding in the pointcut or something like that; I just remember it was really difficult getting that to work properly, until I stumbled across the "argNames".
What's way cool about annotations used this way is that you can pass data from the annotated method to the
interceptor. For example, here's the definition of the above annotation:
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) public @interface LocatePreexistingUser { public UserIdentifier userIdentifier() default USER_OBJECT; public boolean userUpdate() default false; }
Here, UserIdentifier is an enum with values USER_OBJECT, EMAIL, and USER_PK. If you are updating one of the
uniquely constrained fields held in the GLUD database ("email", for example), you can annotate the method on your
UserProfileService like this:
@LocatePreexistingUser(userUpdate=true) public void updateEmail(User user, String newEmail) { ... } And then the interceptor can contain code like this: if(annotation.userUpdate) { // tell GLUD service to update its master_user record }
This is really nifty. You can pass in information to the interceptor that tells it how to process invocations of that
particular method, and the annotation that specifies that information is right there at the method definition. Again, I
think this is neat.
When Hibernate is ready to send some SQL to the database, it calls the datasource to get a connection. The
PartitionRoutingDataSource reads the partition id out of the ThreadLocal, and returns a connection to that database. It
extend's Springs AbstractRoutingDataSource and the operative method looks something like this:
protected Object determineCurrentLookupKey() { Integer datasourceNumber = DatasourceSwitchingAspect.datasourceNumberCache.get(); return datasourceNumber; }
Datasource configuration in Spring shows two normal datasources configured (for the two partition case),
user1DataSource and user2DataSource. These are standard lookups (we use JBoss's connection pools, looked up
through JNDI) pointing directly at the physical databases. However, the datasource that we feed to the (single)
Hibernate session factory is configured like this:
<bean id="userDataSource" class="PartitionRoutingDataSource">
    <property name="targetDataSources">
        <map key-type="java.lang.Integer">
            <entry key="1" value-ref="user1DataSource"/>
            <entry key="2" value-ref="user2DataSource"/>
        </map>
    </property>
</bean>
The Hibernate session factory is created in the usual way by Spring using this magical datasource:
<bean id="userSessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean">
    <property name="dataSource" ref="userDataSource"/>
    etc etc
</bean>
Nothing special here. Now, to enable more database partitions, one simply configures the connection pools in the
appserver, adds the datasource to Spring, and then adds the reference to the PartitionRoutingDataSource. Done! And
the nice thing is that you can handle arbitrary numbers of database partitions without creating a zillion session
factories.
There are some more interesting complications and details to worry about. For example, you want to make sure that
you've set the partition number before you open a transaction in Spring, as Hibernate sometimes aggressively grabs
a connection. In other words, you need to make sure that the "order" attribute on the DatasourceSwitchingAspect is
lower than the one on the transaction interceptor. Here, the DatasourceSwitchingAspect is set to order=1, and the
transaction interceptor is set to order=2.
<aop:config>
    <aop:pointcut id="profileServicePointcut" expression="execution(* *..ProfileService.*(..))"/>
    <aop:advisor advice-ref="userTxAdvice" pointcut-ref="profileServicePointcut" order="2"/>
</aop:config>
The "userTxAdvice" is the usual old transaction advice in Spring ("<tx:advice ...>"). The transaction manager is a
normal HibernateTransactionManager.
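Why the order values matter can be shown with a toy model. In Spring AOP, advice with a lower order value has higher precedence and runs first on the way into the method. The sketch below only mimics that rule with plain Java; the Advice interface and the log labels are hypothetical and this is not how Spring builds its interceptor chain internally:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class AdviceOrderingSketch {

    // Hypothetical minimal advice: an order value plus a "before" action.
    interface Advice {
        int order();
        void before(List<String> log);
    }

    static Advice advice(int order, String label) {
        return new Advice() {
            public int order() { return order; }
            public void before(List<String> log) { log.add(label); }
        };
    }

    /** Runs the advice from lowest to highest order, then the target method. */
    static List<String> run(List<Advice> advices) {
        List<Advice> sorted = new ArrayList<>(advices);
        sorted.sort(Comparator.comparingInt(Advice::order));
        List<String> log = new ArrayList<>();
        for (Advice a : sorted) {
            a.before(log);
        }
        log.add("invoke ProfileService method");
        return log;
    }
}
```

With the partition-switching advice at order 1 and the transaction advice at order 2, the log reads "set partition", "open transaction", "invoke ProfileService method", regardless of the order in which the two were declared. Swap the order values and the transaction would open (and possibly grab a connection) before any partition had been chosen.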
One of the comments on Mark Fisher's blog indicated that this sort of configuration would be a mess with the
second-level cache in Hibernate unless the id spaces were kept separate. We thought of several ways to do this (assigning
a range to each database, etc) but the DBAs really liked the idea of a high-low table in GLUD, so we decided to
implement that. Hibernate has a high-low primary key generator, but it assumes that the sequence tables are in the
same database that you are inserting into, while ours were going to be in the GLUD database. To implement this
without writing our own high-low key generator, we wrote a wrapper class for Hibernate's key generator. The
wrapper class simply grabs a session from the GLUD Hibernate session factory and passes it to the key generator. The
GLUD session comes from an ApplicationContextAware Singleton object (gasp!) that holds a reference to the Spring
application context and grabs the GLUD session when necessary. The gludSessionFactory couldn't be
dependency-injected via Spring because Hibernate creates the key generators under the covers at an undisclosed
location, hence the resort to the Singleton (anti)pattern.
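// Hibernate 3.x package names
import java.io.Serializable;
import java.util.Properties;
import org.hibernate.HibernateException;
import org.hibernate.MappingException;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.dialect.Dialect;
import org.hibernate.engine.SessionImplementor;
import org.hibernate.id.Configurable;
import org.hibernate.id.IdentifierGenerator;
import org.hibernate.id.MultipleHiLoPerTableGenerator;
import org.hibernate.type.Type;
public class UserDbIdGenerator implements IdentifierGenerator, Configurable {
    private MultipleHiLoPerTableGenerator generator;
    public UserDbIdGenerator() {
        generator = new MultipleHiLoPerTableGenerator();
    }
    public Serializable generate(SessionImplementor profileSession, Object entity) throws HibernateException {
        // The hi-lo table lives in GLUD, so ignore the partition session and open a GLUD session instead
        SessionFactory gludSessionFactory = getGludSessionFactory();
        Session gludSession = gludSessionFactory.openSession();
        Transaction txn = gludSession.beginTransaction();
        // Pass through to the wrapped id generator
        Long key = (Long) generator.generate((SessionImplementor) gludSession, entity);
        txn.commit();
        gludSession.close();
        return key;
    }
    protected SessionFactory getGludSessionFactory() {
        // Looked up through the singleton because Hibernate instantiates this class itself
        SessionFactory sessionFactory = (SessionFactory) SpringContextSingleton.getInstance().getBean("gludSessionFactory");
        return sessionFactory;
    }
    public void configure(Type type, Properties props, Dialect dialect) throws MappingException {
        generator.configure(type, props, dialect);
    }
}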
In our Hibernate mapping files, now the objects need to use this id generator class:
<class name="Foo" table="foo">
    <id name="id" column="id">
        <generator class="UserDbIdGenerator">
            <param name="primary_key_value">foo</param>
            <param name="max_lo">5000</param>
        </generator>
    </id>
</class>
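For readers who haven't met the high-low scheme, here is a rough sketch of the idea behind it. This is a simplified variant written for illustration, not Hibernate's exact arithmetic: an AtomicLong stands in for the hi-value row in the GLUD table, and each generator instance (think: one per app server) hands out a whole block of ids per trip to that central store:

```java
import java.util.concurrent.atomic.AtomicLong;

class HiLoSketch {

    private final AtomicLong centralHi; // stand-in for the hi value stored in GLUD
    private final int maxLo;            // ids per block is maxLo + 1
    private long hi;
    private int lo;

    HiLoSketch(AtomicLong centralHi, int maxLo) {
        this.centralHi = centralHi;
        this.maxLo = maxLo;
        this.lo = maxLo + 1; // force a fetch from the central store on first use
    }

    /** Returns the next id, fetching a new block only when the current one is used up. */
    synchronized long next() {
        if (lo > maxLo) {
            hi = centralHi.getAndIncrement(); // the only "round trip" to the central store
            lo = 0;
        }
        return hi * (maxLo + 1) + lo++;
    }
}
```

Two generators sharing the same central counter never overlap, because each block of ids belongs to exactly one hi value. With max_lo set to 5000 as in the mapping above, the central table would be touched roughly once per 5000 inserts.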
So overall, this partitioning scheme works pretty nicely: it's in production and seems to be fairly performant. However,
if you are thinking of implementing horizontal partitioning for your application, I'd like to point out several gotchas that
you need to know about.
2nd level cache
We have had a fair number of glitches with the Hibernate second-level cache. In short, faking out a Hibernate session
factory (which believes with all its heart and soul that it's connected to a single database) to work with multiple
databases is fraught with peril. For object caching, it's generally ok, as our id space is unique and Foo#1 is only found
in one partition. However, query caching is a nightmare.
Let's say you issue a query "give me all the blog entries since last Sunday". First the query runs against partition 1, and the
results are cached in the query cache. Then the interceptor attempts to run the query against partition 2, but since the
query is cached the same result set comes back. Those objects are not in partition 2, but they are in the cache, so in
general you end up getting dupes of everything in the first partition and nothing in later partitions. Think really hard
about any query caching you attempt in this scheme.
In general, you can be operating in a session attached to partition N, but working with objects in partition M (because
you found them in the second-level cache). If Hibernate ever decides it wants to go to the db to fill out those objects,
you are hosed, because you are attached to the wrong db.
If you went with a Shards-style solution with one session factory (and hence one second-level cache) per database,
this sort of thing would be completely eliminated.
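The query cache failure is easy to reproduce in miniature. The sketch below is a toy model, not Hibernate's query cache: the partitions are a map from partition id to the rows a query would return there, and the two methods differ only in what they use as the cache key. Whether the real query cache can be keyed per partition in this single-session-factory setup is something I have not established:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class QueryCacheSketch {

    // partition id -> rows the query returns on that partition (toy data)
    private final Map<Integer, List<String>> partitions;
    private final Map<String, List<String>> cache = new HashMap<>();

    QueryCacheSketch(Map<Integer, List<String>> partitions) {
        this.partitions = partitions;
    }

    /** Mimics a cache that believes there is only one database: key is the query alone. */
    List<String> queryKeyedByQueryOnly(int partition, String query) {
        return cache.computeIfAbsent(query, q -> partitions.get(partition));
    }

    /** Includes the partition in the key, so each partition gets its own cached result. */
    List<String> queryKeyedByPartition(int partition, String query) {
        return cache.computeIfAbsent(partition + ":" + query, q -> partitions.get(partition));
    }
}
```

With the query-only key, asking partition 2 after partition 1 returns partition 1's rows again, which is exactly the "dupes of everything in the first partition" symptom. One session factory per database gives the partition-aware behaviour for free, since each factory has its own cache.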
JTA
The system of the partitioned user databases and the GLUD database really form a unit: you don't want transactions
committing in one database and not committing in the other. If you want to wrap them in a single transaction, you
might want to use JTA. I'm not convinced JTA will work in this scheme. Imagine this scenario: you open your JTA
transaction, and then touch the Hibernate session factory and talk to database partition 1. If you make some updates,
Hibernate generally holds onto the SQL until the end of the transaction. Now, in the same JTA transaction you talk to
partition 2. I can see one of two really bad things happening: 1) Hibernate says "hey I already have a connection in this
session" and uses the partition 1 connection, or 2) when the transaction commits, the session has SQL for partition 1
and partition 2 stored up. How does it know which statements to send to which partition?
There are some scattered statements in the Hibernate documentation that lead me to think that you can configure
Hibernate to aggressively issue the SQL and release the connection on a statement-by-statement basis. However,
I've not verified this. I once attempted to set up a JTA transaction manager in our app but couldn't get it to work
(Spring's JtaTransactionManager refused to find JBoss's installed transaction manager). I only spent an hour or so
on it, and now I think I know what the problem was (duplicate jta.jar files in the classpath).
Again, in a one-session-factory-per-database style partitioning scheme, JTA should work fine.
Testing
Testing becomes painful (and this is not specific to our style of partitioning) when partitioning is involved. We've
evolved two types of tests against the partitioned databases. First there are the ROITs (regular old integration tests), which
test DAOs against a single partition. Then we have the partitioned integration tests, which use GLUD and the partitions together.
You have to write these tests at least to test the interceptor, routing datasource, and all the XML config to make sure it
all works properly. However, it takes some creativity to set up the application context for these tests to avoid either
instantiating the entire world or writing duplicate configuration files for everything. Using DbUnit is a bit of a challenge
as well, as you generally need to insert/update data in both GLUD and the partitions (possibly several).
Shared Objects
Finally, one more pain point that you need to be aware of: objects with shared identity across partitions are a mess.
Let's say our Blog object has a Category, and it's a many-to-many relationship. If you want the Category objects in the
same database as the Blogs, they need to appear in each partitioned database. So Category(id=1, name="java")
appears in two databases. When they are loaded into the second-level cache they will fight to the death and visit pain
and suffering upon all transactions that dare to visit there. You could turn off caching for these things, turn off
optimistic locking (version), or put them in another database (GLUD?). Again, if you had separate session factories (with
the concomitant separate second-level caches) this sort of thing wouldn't hurt so bad.
I hope you found this description interesting. If you are going to partition, you might want to consider doing it this way.
However, do be aware of all the problems noted above: the second-level cache (and possibly JTA) will not quite work
correctly. The one-session-factory-per-database configuration will consume more resources (and slow down the
startup of the app) but would most likely solve these issues.
I think if I were to start it again I'd go back to the original way of multiple session factories (but god how I hate watching
them all start up when I bounce my app server). Or I'd check to see what's up with Shards these days, that might even
be better. We may yet change to one of those methods because of the issues noted above. Nevertheless, it's quite an
interesting challenge to implement partitioning with these technologies. Let me know what your experiences are!