From: http://www.manageability.org/blog/stuff/about-ebays-architecture
The most reliable way to know what works is to look at what works in practice. The software industry is plagued with ideas that are, for all intents and purposes, purely theoretical. Compounding the problem is the fact that software vendors continue to praise and sell these ideas as best practices.
Massively scalable architecture is one area that few practitioners have truly witnessed firsthand. Fortunately, information is sometimes graciously released for all to see and hear. I gained a lot of wisdom reading about Google's design of its hardware infrastructure and Yahoo's page rendering patent. Now another internet behemoth, eBay, has provided us with some insight into its own architecture.
There are many pieces of information in this presentation; I'll try to highlight and comment on the ones that are unusual or interesting.
The impressive part is that eBay had 380M page views a day with a site availability of 99.92%, on top of nearly 30K lines of code changes per week. That is plainly enviable and, beyond that, incontrovertible evidence of the scalability of Java.
Now for the details on how it was achieved using J2EE technologies. The highlights of eBay's scalability are as follows:
- Judicious use of server-side state
- No server affinity
- Functional server pools
- Horizontal and vertical database partitioning
What's interesting is how eBay enables data access scalability. They mention the use of "custom O-R mapping" with support for features like caching (local and global), lazy loading, fetch sets (deep and shallow) and support for retrieval and submit update subsets. Furthermore, they use bean-managed transactions exclusively, auto-committed to the database, and use the O-R mapping to route to different data sources.
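To make the lazy loading and shallow-versus-deep fetch idea concrete, here is a minimal Java sketch. It is not eBay's mapping layer, whose internals the presentation does not spell out; `Lazy`, `ItemRecord`, and the loader lambda are hypothetical names chosen purely for illustration.

```java
import java.util.function.Supplier;

// Hypothetical helper: defers an expensive load until the value is first needed.
final class Lazy<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    // Runs the loader at most once, then returns the remembered value.
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    synchronized boolean isLoaded() {
        return loaded;
    }
}

// Hypothetical mapped object: a shallow fetch fills id and title only;
// the description is fetched when (and if) a caller asks for it.
final class ItemRecord {
    final long id;
    final String title;
    private final Lazy<String> description;

    ItemRecord(long id, String title, Supplier<String> descriptionLoader) {
        this.id = id;
        this.title = title;
        this.description = new Lazy<>(descriptionLoader);
    }

    String description() {
        return description.get();
    }

    boolean descriptionLoaded() {
        return description.isLoaded();
    }
}

final class LazyLoadingDemo {
    public static void main(String[] args) {
        // The lambda stands in for a database query.
        ItemRecord item = new ItemRecord(1L, "Vintage camera", () -> "A long description...");
        System.out.println(item.title + " loaded=" + item.descriptionLoaded());
        System.out.println(item.description() + " loaded=" + item.descriptionLoaded());
    }
}
```

A listing page would touch only `title` and never pay for the description, while an item page would trigger the load on first access.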
A few things are quite striking. The first is the complete absence of Entity Beans; eBay uses its own O-R mapping solution instead (Hibernate, anyone?). The second is the partitioning of application servers based on use cases. The third is that the databases are also partitioned based on use cases. The last is the stateless nature of the system and the conspicuous absence of clustering technologies.
Here's the quote about server state:
This basically means that right now we are not really using server-side state. We may use it; right now we have not found a good reason to use it. [snip] if there is something that needs to be stateful, then we put in the database; we go back and get it, if we need to. We just take the hit. We do not have to do clustering; we do not have to do any of that stuff.
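The idea in that quote can be sketched in a few lines of Java. This is only an illustration under my own assumptions: `SharedStore` is an in-memory stand-in for the database, and `StatelessServer` and the "watch item" request are hypothetical names, not anything taken from eBay's code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for the database. A real system would go through JDBC or a mapping layer.
final class SharedStore {
    private final Map<String, Integer> watchCounts = new ConcurrentHashMap<>();

    // Atomically adds one to the user's count and returns the new total.
    int incrementWatchCount(String userId) {
        return watchCounts.merge(userId, 1, Integer::sum);
    }
}

// Holds no per-user fields, so any instance can serve any request.
final class StatelessServer {
    private final String name;
    private final SharedStore store;

    StatelessServer(String name, SharedStore store) {
        this.name = name;
        this.store = store;
    }

    // Every request goes back to the store for its state ("we just take the hit").
    String handleWatchItem(String userId) {
        int total = store.incrementWatchCount(userId);
        return name + ": " + userId + " is watching " + total + " item(s)";
    }
}

final class StatelessDemo {
    public static void main(String[] args) {
        SharedStore store = new SharedStore();
        StatelessServer a = new StatelessServer("server-a", store);
        StatelessServer b = new StatelessServer("server-b", store);
        // Consecutive requests from the same user can land on different servers.
        System.out.println(a.handleWatchItem("alice"));
        System.out.println(b.handleWatchItem("alice"));
    }
}
```

Because neither server remembers anything about the user, there is nothing to replicate between them, which is why clustering drops out of the picture.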
In short, save yourself the trouble of building stateful servers, and forget about clustering; you simply may not need it. Now, read this about functional partitioning:
So we have a pool or a farm of machines that are dedicated to a specific use case; like search will have its own farm of machines, and we can tune those much differently because the footprint and the replay of those are much different than viewing an item, which is essentially a read-only use case, versus selling an item, which is read-mostly type of use case. [snip] Horizontal database partitioning is something that we have adopted in the last probably four or five years to really get the availability, and also scalability, that we need.
In short, forget about placing your application and database on one giant machine; just use pools of servers that are dedicated on a use-case basis. Doesn't that sound awfully similar to Google's strategy?
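As a rough illustration of use-case pools combined with no server affinity, here is a small Java sketch. The class name `FunctionalPools`, the use-case labels, the server names, and the round-robin choice are all my own assumptions; the presentation does not say how eBay balances requests within a pool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical dispatcher: one pool of servers per use case.
// Pools are assumed to be registered at startup, before any request is routed.
final class FunctionalPools {
    private final Map<String, List<String>> poolsByUseCase = new HashMap<>();
    private final Map<String, AtomicInteger> counters = new HashMap<>();

    void addPool(String useCase, List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("a pool needs at least one server");
        }
        poolsByUseCase.put(useCase, new ArrayList<>(servers));
        counters.put(useCase, new AtomicInteger());
    }

    // Round-robin within the pool: no request is pinned to a particular server.
    String nextServer(String useCase) {
        List<String> pool = poolsByUseCase.get(useCase);
        if (pool == null) {
            throw new IllegalArgumentException("no pool for use case: " + useCase);
        }
        int n = counters.get(useCase).getAndIncrement();
        return pool.get(Math.floorMod(n, pool.size()));
    }

    public static void main(String[] args) {
        FunctionalPools pools = new FunctionalPools();
        pools.addPool("search", Arrays.asList("search-1", "search-2"));
        pools.addPool("view-item", Arrays.asList("view-1"));
        System.out.println(pools.nextServer("search"));
        System.out.println(pools.nextServer("search"));
        System.out.println(pools.nextServer("view-item"));
    }
}
```

Each pool can then be sized and tuned for its own workload, since a search box never has to serve a selling request.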
A little bit more about horizontal partitioning:
What enables our horizontal scalability is content based routing. So, if imagine eBay has on any given day 60 million items. We do not want to store that in one behemoth Sun machine. [snip] let us scale it across; may be, many Sun machines, but how you get to the right one? There is the content-based routing idea that comes in play. So, the idea was that given some hint, find out which of my 20 physical database hosts do I need to go to. The other cool thing about this is that failover could be defined.
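Content-based routing with failover might look something like the following Java sketch. The quote only says that a hint selects one of the physical hosts; using a numeric item id with a modulo, the `ContentBasedRouter` and `Partition` names, and the host names are all assumptions on my part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical router: maps a routing hint to one database host.
final class ContentBasedRouter {
    // A logical partition: the host normally used, plus the host to fail over to.
    static final class Partition {
        final String primary;
        final String failover;

        Partition(String primary, String failover) {
            this.primary = primary;
            this.failover = failover;
        }
    }

    private final List<Partition> partitions;

    ContentBasedRouter(List<Partition> partitions) {
        if (partitions.isEmpty()) {
            throw new IllegalArgumentException("at least one partition is required");
        }
        this.partitions = new ArrayList<>(partitions);
    }

    // Assumption: the hint is a numeric item id, and modulo picks the partition.
    // floorMod keeps the index non-negative even for negative ids.
    Partition partitionFor(long itemId) {
        int index = (int) Math.floorMod(itemId, (long) partitions.size());
        return partitions.get(index);
    }

    // Uses the primary unless it is known to be down, in which case it fails over.
    String hostFor(long itemId, Set<String> downHosts) {
        Partition p = partitionFor(itemId);
        return downHosts.contains(p.primary) ? p.failover : p.primary;
    }

    public static void main(String[] args) {
        ContentBasedRouter router = new ContentBasedRouter(Arrays.asList(
                new Partition("db0", "db0-standby"),
                new Partition("db1", "db1-standby")));
        System.out.println(router.hostFor(42L, Collections.<String>emptySet()));
        System.out.println(router.hostFor(43L, Collections.singleton("db1")));
    }
}
```

A plain modulo is the simplest possible rule; it reshuffles most items whenever a host is added, so a real deployment would more likely use a lookup table or range map that can be changed without moving everything.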
Finally, a word about using a more loosely coupled architecture in the future:
Using messaging to actually decouple disparate use cases is something that we are investigating.
Isn't it strange that the original presentation was about J2EE Design Patterns? The key scalability ideas are only tangentially related to the patterns. Yes, eBay does use patterns to structure its code; however, focusing on the patterns misses the bigger picture. The key nuggets of wisdom are a stateless design, the use of a flexible and highly tuned O-R mapping layer, and the partitioning of servers based on use cases. The design patterns are nice, but don't expect blind application of them to lead to scalability.
In general, the approach that eBay is alluding to (and Google has confirmed) is that architectures consisting of pools or farms of machines dedicated on a use-case basis will provide better scalability and availability than a few behemoth machines. The vendors, of course, are gripped by fear about this conclusion, for obvious reasons. Nevertheless, the biggest technical hurdle in deploying a large number of servers is, of course, none other than the need for manageability ;-)