http://gojko.net/2009/06/23/improving-performance-and-scalability-with-ddd/
Distributed systems are not typically a place where domain driven design is applied. Distributed processing projects often start with an overall architecture vision and an idea about a processing model which basically drives the whole thing, including object design if it exists at all. Elaborate object designs are thought of as something that just gets in the way of distribution and performance, so the idea of spending time to apply DDD principles gets rejected in favour of raw throughput and processing power. However, from my experience, some more advanced DDD concepts can significantly improve performance, scalability and throughput of distributed systems when applied correctly.
This is a summary of my presentation titled “DDD in a distributed world” at the DDD Exchange 09 in London. You can also download the slides accompanying this presentation.
One of the most important building blocks of DDD that can help in distributed systems are aggregates. Unfortunately, at least judging by the discussions that I’ve had with client teams over the last few years, aggregates seem to be one of the most underrated and underused building blocks of DDD. I’m probably as guilty as anyone else of misusing aggregates and it took me quite a while to grasp the full potential of that concept. But from this perspective I think that getting aggregates just right is key to making a distributed system work and perform as expected.
Aggregates are about units of consistency – not about pointers
Most folks think about aggregates in terms of pointers – for example, the rule that objects outside an aggregate should only reference the aggregate root, never objects inside the aggregate. Although very useful in general, this rule is not the whole story of aggregates, but it is where most discussions end. Aggregates are also perceived as very technical in nature, almost something that should be left out of discussions with business experts, similar to repositories or factories. I think this is a mistake. Aggregates represent a very clear business domain aspect that should definitely be discussed with domain experts. I found it much more important to focus on the fact that an aggregate is a unit of consistency from a business perspective.
We can look at an aggregate from a business perspective as a unit of consistency in our problem domain: something that in a business sense always needs to be perfectly coordinated and consistent, something that is only complete as a whole. Eric Evans calls this a “meaningful whole”, suggesting that an aggregate itself has a meaning separate from the meaning of its parts. As such, it is a genuine business domain concept, not just a technical guideline about references. In order to find correct aggregate boundaries we shouldn't be concerned so much with referencing code and class encapsulation; this is more an exercise in identifying business requirements for consistency.
From that perspective, aggregates have a clear impact on the processing model and implementation of the system. An aggregate is a natural unit of distribution, because we probably need all the information in an aggregate to meaningfully process it: whenever we send information down the wire, the whole aggregate needs to go. It also means that we need to ensure that all changes to a single aggregate are saved as part of a single transaction. Once we have done that, we have genuinely completed a meaningful piece of work. This potentially gives us the freedom to work on different aggregates in parallel or in separate transactions.
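To make this concrete, here is a minimal Java sketch of an aggregate modelled as a unit of consistency. The Order and OrderLine names are my own hypothetical example, not from the article; the point is that the invariant only holds for the whole, and that outside code only ever touches the root.

```java
// A hypothetical Order aggregate: the invariant "total equals the sum of the
// lines" is only meaningful for the whole, and outside code only sees the root.
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

public class Order {                          // aggregate root
    private final List<OrderLine> lines = new ArrayList<>();
    private BigDecimal total = BigDecimal.ZERO;

    public void addLine(String product, BigDecimal price) {
        lines.add(new OrderLine(product, price));
        total = total.add(price);             // invariant maintained by the root
    }

    public BigDecimal total() {
        return total;
    }

    // Never referenced from outside the aggregate.
    private record OrderLine(String product, BigDecimal price) {}
}
```

A repository would then load and save the whole Order in a single transaction, so there is never a persisted state where a line exists without the total reflecting it – which is exactly what allows different Orders to be processed in parallel.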
When these forces are aligned, we get a very nice distributed system design and an implementation that performs well. When they are in conflict, we should stop and analyse the system instead of just ploughing away with the code. The conflict may come from either the technical or the business side, but having these concepts explicitly defined gives us a good framework to spot something smelling dirty and fix the problem before it grows too big. That is why aggregate models provide some great hints about distributed processing.
Hint #1: If you don’t aggregate enough, even the fastest computers will spin their wheels in vain
Latency is one of the classic problems of distributed systems. It is a very elusive concept that the average programmer just ignores, because it seems like a networking issue that should be solved by networking people. I remember a guy who built his own messaging solution and proudly boasted about its performance, demonstrating it by running both the server and the clients on his machine. When we deployed it properly, with one system in the UK and another in the States, the whole thing fell over almost instantly.
Think of it this way: to send a page by post you need to find an envelope, put a stamp on it and put it into a post box. There is also one less visible step that most people don't take into account – walking out of the house to the post box. That walk is where the latency kicks in. It's not something people see as integral to the process, but it nevertheless exists. Say you wanted to post a contract that has twenty pages. Doing the whole routine for every single page would instinctively feel wrong and no sane person would do it – we would just put all the pages in a large envelope, stick a single stamp on it and take it all to the post box at once. Yet I find people doing the software equivalent of posting each page separately over and over again, making the system walk to the post box twenty times instead of once and paying for twenty virtual postage stamps. This is where aggregates start to kick in. The whole twenty-page contract is an aggregate, and the very fact that it is an aggregate should suggest that it be shipped together. It is a meaningful whole, so whoever needs a page probably needs the entire thing.
So I suggest the first rule to help distributed systems with aggregates: Ship entire aggregates straight away to avoid latency.
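As an illustration, here is a sketch of the "twenty stamps" anti-pattern versus shipping the aggregate in one call. The remote interface and type names are assumptions of mine for the sake of the example.

```java
import java.util.List;

record Page(int number, String text) {}
record Contract(long id, List<Page> pages) {}

// Hypothetical remote service: one method pays the latency "postage" per page,
// the other ships the whole aggregate in a single round trip.
interface ContractService {
    Page fetchPage(long contractId, int pageNumber); // chatty: one round trip per page
    Contract fetchContract(long contractId);         // one envelope, one stamp
}

class ContractPrinter {
    // Anti-pattern: walks to the post box twenty times.
    void printChatty(ContractService service, long contractId, int pageCount) {
        for (int i = 1; i <= pageCount; i++) {
            System.out.println(service.fetchPage(contractId, i).text());
        }
    }

    // Aggregate-aware: the meaningful whole travels together.
    void printWhole(ContractService service, long contractId) {
        for (Page page : service.fetchContract(contractId).pages()) {
            System.out.println(page.text());
        }
    }
}
```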
Hint #2: If you aggregate too much, baggage becomes a problem
The second classic problem of distributed systems is serialization performance. Serialization and deserialization are often among the biggest performance hogs in a system as they rely on reflection, IO and traversing complicated object graphs. This is again where aggregates can help, but the issue is a bit more subtle.
We need to keep an eye out for conflicts between the intended processing model and the business view of aggregates. For example, in a classic account system with accounts and transaction histories, it seems perfectly logical to have the account history as part of the account aggregate.
Say we wanted to process transactions for an account on different machines. Sending huge object graphs over the wire all the time makes no sense technically, so shipping the whole history every time we send the account would kill the system's performance. There we have a conflict between the business view of the aggregate and the intended processing model.
Instead of just ploughing away with the code and breaking the aggregate apart for this one occasion (which would eventually corrupt the design), I suggest pausing and thinking a bit more about this conflict. This is a very subtle trap. First of all, there are no universal models: a model, along with its aggregate structure, only works within a particular set of scenarios, so making the account history part of the account aggregate may or may not make sense depending on how we are going to use the accounts. When there is a conflict between the processing model we want to implement and the aggregate model suggested by the business, that might point to better ways to design the system.
The rule that the whole aggregate gets shipped together would make us ship the whole history each time we send out account information, which makes absolutely no sense. On the other hand, being able to meaningfully operate on just a few pieces of information conflicts with the business definition of the aggregate.
One option is that the history is not really part of the account aggregate – maybe it is just a list of domain events recording what happened, or maybe those events form part of a separate statement aggregate. In that case, the model gives us a real argument to talk to the business about statements being something that can be updated asynchronously, since they are not part of the main account aggregate. This is why you get the correct balance but not necessarily all the statements in internet banking applications. Inserting a new transaction record might require locking a sequence generator or a table in some database systems, so it costs a lot. If the transactions can be updated asynchronously, we can significantly reduce locking and improve system performance. This model gives us distributed processing, but it clearly points out that the history is not going to be 100% consistent with the balance all the time. It requires recognition from the domain experts that the history does not belong to the unit of consistency of an account.
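A minimal sketch of what that split might look like, assuming hypothetical Account and StatementUpdater types and an in-memory queue standing in for real messaging: the balance is updated synchronously, while statement entries are recorded as domain events and applied later in their own transactions.

```java
import java.math.BigDecimal;
import java.util.concurrent.BlockingQueue;

record AccountDebited(long accountId, BigDecimal amount) {}

class Account {                                // small, consistent aggregate
    private final long id;
    private BigDecimal balance;
    private final BlockingQueue<AccountDebited> events;

    Account(long id, BigDecimal balance, BlockingQueue<AccountDebited> events) {
        this.id = id;
        this.balance = balance;
        this.events = events;
    }

    void debit(BigDecimal amount) {
        balance = balance.subtract(amount);            // synchronous, consistent
        events.add(new AccountDebited(id, amount));    // history as a domain event
    }
}

class StatementUpdater implements Runnable {   // applies the history asynchronously
    private final BlockingQueue<AccountDebited> events;

    StatementUpdater(BlockingQueue<AccountDebited> events) {
        this.events = events;
    }

    @Override
    public void run() {
        try {
            while (true) {
                AccountDebited event = events.take();
                // insert into the statement store in its own transaction;
                // the balance is already correct even if this lags behind
                System.out.println("statement entry: " + event);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```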
If, on the other hand, we don't get that confirmation from the business, the fact that all of it has to be part of the same aggregate tells us that distributed processing of a single account is not where we want to go. We need a different solution, possibly a distributed master-worker design with worker machines owning accounts and the master routing tasks to the appropriate machine.
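A toy sketch of that routing idea, under the assumption that accounts are partitioned by id – the names and the modulo scheme are mine, not from the article:

```java
import java.util.List;

// Each worker owns a fixed subset of accounts, so a single-account update
// never needs to span machines or be coordinated across them.
class Master {
    private final List<Worker> workers;

    Master(List<Worker> workers) {
        this.workers = workers;
    }

    void submit(long accountId, Runnable task) {
        // deterministic ownership: the same (non-negative) account id
        // always routes to the same worker
        int owner = (int) (accountId % workers.size());
        workers.get(owner).enqueue(task);
    }
}

interface Worker {
    void enqueue(Runnable task);
}
```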
So the second rule I’d like to suggest is: If you don’t need it all, it is probably not an aggregate and you can improve performance with asynchronous updates.
Hint #3: If you don’t aggregate enough, things are going to be broken just when you need them
This relationship between synchronous atomic updates and aggregates works both ways. Without properly defined aggregate boundaries in the model, we might actually be breaking transactions into pieces that do not make sense individually. If there are no aggregate boundaries, there is nothing to tell us what a consistent update actually is from a domain perspective, and we'll probably form transactions around requests. This might lead to a lot of unnecessary code to handle technical “edge cases” which aren't really business exceptions at all.
For example, a trading system might require a customer's credit card details for any meaningful operation. Because of web design constraints, the registration process on the web site might be divided into several steps, where customers first fill in their personal details and then enter payment details on a separate screen.
If we don't capture the fact that the payment information is actually part of the customer aggregate, the natural way to implement this is to form transactions around requests – posting after each screen. That allows the system to create customers with no payment information (when they close the browser or just don't complete the registration). These strange cases start popping up in reports, and we have to implement workflows to handle such customers, preventing them from accessing functionality that requires payments, and so on.
If we do consider both parts a single aggregate, the rule that we should ship entire aggregates (even from the web site, to create the customer account) leads to the conclusion that the data should be collected from both screens first and sent at once, instead of being sent after each individual screen and processed in the database.
Because aggregates are units of consistency, in order for them to stay consistent we need to keep all steps in an update of an aggregate together. Aggregate updates need to obey the samurai principle: either return successfully or not return at all. Without this, we are opening the doors for data corruption – not in the disk failure sense but in the business sense. The technical edge cases that don’t exist in the business simply go away, so we don’t have to handle them or maintain code to work around them.
This is also where transactions come in, because atomicity fits aggregates really nicely. If we think of aggregates as units of consistency, we can design the transactions in our system to work on a single machine, since all the information that has to be processed synchronously within a transaction resides in a single place. Aggregates are natural atomic parts of transactions.
The third rule I suggest for improving distributed systems is: We should form our transactions around business units of consistency rather than technical requests to avoid data corruption and simplify processing. The entire aggregate should be processed synchronously within a transaction.
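A sketch of forming the transaction around the aggregate rather than around HTTP requests – RegistrationDraft and CustomerService are hypothetical names; the draft lives in the web session and nothing touches the database until both screens are complete.

```java
// Accumulates input across screens without persisting anything.
class RegistrationDraft {
    private String personalDetails;
    private String paymentDetails;

    void screenOne(String personalDetails) { this.personalDetails = personalDetails; }
    void screenTwo(String paymentDetails)  { this.paymentDetails = paymentDetails; }

    boolean complete() {
        return personalDetails != null && paymentDetails != null;
    }
}

class CustomerService {
    // Samurai principle: one call, one transaction. The customer aggregate is
    // either created whole, with payment details, or not created at all.
    void register(RegistrationDraft draft) {
        if (!draft.complete()) {
            throw new IllegalStateException("registration is incomplete");
        }
        // begin transaction; insert customer and payment details; commit
    }
}
```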
Hint #4: If you aggregate the wrong things, deadlocks galore
Locking resources in distributed systems is another of those invisible issues that can just kill you if you don't get it right. Distributed systems are typically built to process more than a single machine can, to increase overall system throughput. Yet if critical resources are locked too often and processes frequently have to wait for other processes to release those locks, spanning multiple machines might actually give you worse performance.
One example that came up in a recent DDD Immersion workshop, changed a bit to protect the innocent, is a scheduling system that involves event organisers and events. The unit of consistency in a business sense was an event organiser, so that was the aggregate by design. At some point, the system was integrated with an external notification source that sent messages about changes to events. A message might contain updates for hundreds of events belonging to different organisers – postponements, reschedulings and so on. The business wanted a notification message to be processed within a single transaction, regardless of what it contained. After the change, system performance seriously deteriorated, as a single notification message was locking hundreds of organiser objects.
This is another example of the processing model conflicting with the business model of aggregates. If the organiser and its events are a unit of consistency, then we should be able to update one organiser entirely without updating the others, and the system would still be consistent. If this is confirmed by the business (which is more likely than not), then we have a good case against the requirement to process the entire message in a single transaction. We can take the message, store it and commit; then break it apart into chunks related to individual organisers in another transaction and commit that; and then process the individual chunks in separate transactions. This model also allows us to process individual organisers in parallel, so the business will probably see the final effect sooner than with a single transaction anyway.
The fourth rule coming out of this is: When locking, lock aggregate roots. Challenge requests to lock more than one at the same time.
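Here is a sketch of the store-split-process approach described above: three phases, each in its own transaction, so a lock is only ever held on one organiser at a time. The EventUpdate and NotificationProcessor names are my own assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

record EventUpdate(long organiserId, long eventId, String change) {}

class NotificationProcessor {

    void process(List<EventUpdate> message) {
        store(message);                           // txn 1: persist raw message, commit

        Map<Long, List<EventUpdate>> chunks =     // txn 2: split by aggregate root
                message.stream()
                       .collect(Collectors.groupingBy(EventUpdate::organiserId));

        // txn 3..n: one organiser per transaction; chunks can run in parallel
        chunks.forEach(this::processChunk);
    }

    private void store(List<EventUpdate> message) {
        // persist the incoming message and commit, so nothing is lost
    }

    private void processChunk(Long organiserId, List<EventUpdate> updates) {
        // lock only this one organiser root, apply its updates, commit
    }
}
```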
Reinforcing ideas
Aggregates are a curious pattern, as they clearly have both a business domain side and a technical side. These two sides confirm and reinforce each other, and that is what makes aggregates so useful for modelling and designing distributed systems. Once we know what makes a conceptual unit of consistency, we have defined the clusters that need to travel down the wire or be processed as a single unit. If the required processing model contradicts that, we need to revisit either the aggregate definitions or the processing model.
Once we draw the lines of the aggregates and think of them primarily as units of consistency, we can have a meaningful discussion about which implementation model suits our required set of scenarios. Aggregates are natural answers to some of the trickiest issues of distributed systems, including locks, transactions, asynchronous processing, serialization and latency.
If we want one but not the other, that probably means we don't have the correct aggregate boundaries, or that we have unrealistic expectations. Spotting this early on is one of the greatest powers of aggregates as a concept – we don't have to burn our fingers by spending months developing something that simply is not going to perform or satisfy the requirements.
I took notes during other presentations and will post them soon. See earlier DDD exchange posts and subscribe to the RSS feed to get notified when the new ones come online.