refer to : http://faler.wordpress.com/2009/05/08/best-practic
The best practices for high-performance websites are well documented here, so I won’t go into web specifics, although these practices apply equally well as a supplement for websites.
So here goes, 8 tips for making your system highly performant and scalable:
Offload the database
What is the most common bottleneck in applications? On most projects I have done in the past, it has been the relational database and access to it. The solution to this?
Avoid hitting the database, and avoid opening transactions or connections unless you absolutely need them. This means that the popular Open Session in View pattern, while convenient, is often a performance hog that severely limits performance and scalability (though there are ways around it without sacrificing any of the convenience).
“What a difference a cache makes”
So how do you offload the database if you need to access a lot of data? For a mostly read-only application this is simple: add lots of caching. This might even work reasonably well for a read-write application, provided you have clear hooks for expiring cache entries when the underlying data changes.
When it comes to caching, there is a clear hierarchy of efficiency: in-memory is faster than on-disk, on-disk is considerably faster than a remote network location or a relational database.
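As a rough illustration of the in-memory end of that hierarchy, here is a minimal read-through cache with an explicit expiry hook. The class name ReadThroughCache and its methods are made up for this sketch, and the loader function stands in for whatever database access your application really does.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Minimal in-memory read-through cache: the loader runs only on a cache miss. */
class ReadThroughCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical stand-in for a database query

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, calling the loader only if the key is absent. */
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    /** Expiry hook: call this from the code path that writes the underlying data. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

A read-write application would call invalidate from every place that updates the corresponding row, so the next get reloads fresh data. A real cache would also need a size bound or time-based eviction, which this sketch leaves out.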
Cache objects as coarse-grained as possible
If possible, cache objects and data at the most coarse-grained level you can. Even if objects are already cached at a fine-grained level, a coarse-grained cache saves the CPU and time required to interrogate n cache zones rather than a single one. Furthermore, retrieving a full object graph from the cache saves the time otherwise spent assembling it.
Don’t store transient state permanently
There is a tendency in many projects to put absolutely all data in a database. But is it always absolutely necessary? Do you really need to store login session information in a database? Are you storing transient state, or necessary business data?
The “state monster” is a dangerous beast. As a rule of thumb, only store actual, necessary, critical and actionable business data in permanent storage (database, disk) and nothing else.
Location, Location – put things close to where they are supposed to be delivered
A colleague of mine made a good analogy not long ago: if you know you are going to move your heavy cupboard onto a delivery truck tomorrow morning, it is better to put it in the hallway close to the front door, rather than down in the basement at the back of the house.
This is exactly why Content Delivery Networks work for websites, but it is applicable on an application and infrastructure level as well: If you need to hop through a load-balancer, web server, application server and database server, using precious resources in all tiers to retrieve data, rather than just a load-balancer and web server, your scalability and performance will suffer.
Constrain concurrent access to limited resources
Imagine you have a cache miss despite all your caching, and have to do an expensive, calculation-intensive retrieval of data all the way back to the database. Further imagine that 30 other clients want the exact same data before the cache has been primed. You will find yourself in a situation where potentially 30 clients go off and retrieve the same data through the same expensive operation at the same time. What a waste!
A simple way to solve this problem is to have only the first client do the expensive calculation, while the other clients simply wait for the first client’s result (and share it). In Java, a pattern like this can be easily implemented in about 50 lines of code with the help of a CountDownLatch, an ExecutorService and some custom code.
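One possible shape for this is sketched below. Note that it swaps the CountDownLatch and ExecutorService combination for a ConcurrentHashMap of CompletableFuture objects, which keeps the example short; the class name SingleFlight and its load method are invented for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Lets only one caller per key run the expensive loader; the others wait and share its result. */
class SingleFlight<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

    V load(K key, Function<K, V> expensiveLoader) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            // Someone else is already loading this key: block until they finish.
            // If their load failed, join() throws a CompletionException wrapping the cause.
            return existing.join();
        }
        try {
            V value = expensiveLoader.apply(key);
            mine.complete(value);
            return value;
        } catch (RuntimeException | Error e) {
            mine.completeExceptionally(e); // wake the waiters with the failure
            throw e;
        } finally {
            // Forget the finished load; later callers are expected to hit the primed cache instead.
            inFlight.remove(key, mine);
        }
    }
}
```

The first caller would normally put the value into the cache before returning, so that clients arriving after the in-flight entry is removed find it there rather than triggering a second load.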
Constraining access to limited resources in this way is not only applicable to read-only access. It is equally usable for transactional write operations, either through a separate “write-behind” process or through asynchronous messaging such as JMS (effectively variations of the same thing).
99% of the time it is quicker to let a single thread do the work and finish, rather than flooding finite resources with 200 client threads.
Staged, asynchronous processing
This is an extension of the approach described above. Separating a process into discrete, asynchronous steps, connected by queues and each executed by a limited number of workers/threads, will quite often do wonders for both scalability and performance. Furthermore, it minimizes the risk of a system being overloaded and crashing: under a heavy workload, an application may take longer to finish a task put on a task queue, but it most likely won’t crash.
These are the exact reasons why “Staged Event Driven Architectures” (such as Zeus) can produce HTTP servers whose concurrency is limited by OS sockets rather than by available threads (imagine 10000 concurrent clients instead of 150-200), and why the “Actor” concurrency model of Erlang and Scala is so popular in certain large-scale environments such as banks and telecoms systems.
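To make the idea of stages separated by queues concrete, here is a small sketch with two stages, one worker each, connected by bounded queues. Everything in it (the StagedPipeline class, the queue capacity of 16, the trim-and-uppercase "work") is made up for illustration; a real system would use more workers per stage and real tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class StagedPipeline {
    // Unique end-of-input marker, compared by identity (!=) on purpose.
    private static final String STOP = new String("STOP");

    static List<String> run(List<String> inputs) {
        BlockingQueue<String> transformQueue = new ArrayBlockingQueue<>(16);
        BlockingQueue<String> collectQueue = new ArrayBlockingQueue<>(16);
        List<String> collected = new ArrayList<>();
        ExecutorService workers = Executors.newFixedThreadPool(2);

        // Stage 1: transform each item and hand it to the next stage.
        workers.execute(() -> {
            for (String item = take(transformQueue); item != STOP; item = take(transformQueue)) {
                put(collectQueue, item.trim().toUpperCase());
            }
            put(collectQueue, STOP);
        });
        // Stage 2: collect the results (stands in for a slow write).
        workers.execute(() -> {
            for (String item = take(collectQueue); item != STOP; item = take(collectQueue)) {
                collected.add(item);
            }
        });

        // Producer: put() blocks while a queue is full, so overload slows callers down
        // instead of piling up unbounded work.
        for (String input : inputs) {
            put(transformQueue, input);
        }
        put(transformQueue, STOP);
        workers.shutdown();
        try {
            workers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return collected;
    }

    private static void put(BlockingQueue<String> queue, String item) {
        try {
            queue.put(item);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static String take(BlockingQueue<String> queue) {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The bounded queues are the important design choice here: when a stage falls behind, the stage before it blocks rather than flooding it, which is the "slows down but does not crash" behaviour described above.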
Minimize network chatter
Going outside the runtime of your application is slow. Communicating with a remote application is slower than communicating with in-memory objects in the same runtime, and networks are generally less dependable than RAM and definitely have higher latency. Avoid remote communication if you can, and definitely do what you can to avoid making your application too “chatty”. Of course, sometimes distributing state over the network brings immense benefits, even for performance and scalability: if your application needs to constantly interrogate a registry of available services on a network, it makes sense for that registry to be distributed to all available nodes.
But in general, the trade-offs of network communication need to be acknowledged and carefully weighed up, so if it is not absolutely necessary, avoid excessive network communication and traffic.
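One common way to make an application less chatty is to batch requests, so that many small round trips become a few larger ones. The sketch below assumes a hypothetical remote PriceService whose single method accepts a list of ids; neither the interface nor the BatchingClient class comes from any real library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical remote service: each call stands for one network round trip. */
interface PriceService {
    Map<String, Integer> fetchPrices(List<String> productIds);
}

/** Groups lookups into batches so that n ids cost about n / batchSize round trips. */
class BatchingClient {
    private final PriceService remote;
    private final int batchSize;

    BatchingClient(PriceService remote, int batchSize) {
        this.remote = remote;
        this.batchSize = batchSize;
    }

    Map<String, Integer> pricesFor(List<String> productIds) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (int from = 0; from < productIds.size(); from += batchSize) {
            int to = Math.min(from + batchSize, productIds.size());
            // One remote call per batch instead of one per product id.
            result.putAll(remote.fetchPrices(productIds.subList(from, to)));
        }
        return result;
    }
}
```

With a batch size of 50, looking up 200 products takes 4 round trips rather than 200. The right batch size depends on payload size and latency, so treat the number as something to measure rather than a recommendation.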
I might be missing one or two other critical performance points, and this is by no means meant to be a definitive, exhaustive list, but it nonetheless contains very common performance and scalability “gotchas”. If you have more, add them in the comments!