Johannes Ernst's Blog
This week, I attended a very interesting presentation by Dan Pritchett and Randy Shoup, both senior technologists at eBay, on eBay's architecture. Some of it was as I would have expected, other things were, shall we say, counter-intuitive. Here is a random collection of notes, with some special exclamation marks:
- 212 million registered users, 1 billion photos
- 1 billion page views a day, 105 million listings, 2 petabytes of data, 3 billion API calls a month
- growth by something like a factor of 35 in page views, e-mails sent, and bandwidth from June 1999 to Q3/2006.
- 99.94% availability, measured as "all parts of site functional to everybody" vs. at least one part of a site not functional to some users somewhere
- 15,000 application servers, all J2EE. About 100 groups of functionality aka "apps". Notion of a "pool": "all the machines that deal with selling"... Well over 200 databases.
- Everything is planned with the question "what if load increases by 10x". Scaling only horizontal, not vertical: many parallel boxes.
- leverages MSXML framework for presentation layer (even in Java)
- Oracle databases, WebSphere Java (still 1.3.1)
- split databases by primary access path, modulo on a key
- every database has at least 3 on-line copies, distributed over 8 data centers
- some database copies run 15 minutes behind, others 4 hours behind
- no stored procedures. some very simple triggers.
- CPU-intensive work is moved out of the database layer to the application layer: referential integrity, joins, sorting done in the application layer! Reasoning: app servers are cheap, databases are the bottleneck.
- no client-side transactions. no distributed transactions
- J2EE: use servlets, JDBC, connection pools (with rewrite). Not much else.
- no state information in application tier. transient state maintained in cookie or scratch database
- app servers do not talk to each other -- strict layering of architecture
- Search, in 2002: 9 hours to update the index running on largest Sun box available -- not keeping up
- Average item on site changes its search data 5 times before it is sold (e.g. price), so real-time search results are extremely important.
- "Voyager": real-time feeder infrastructure built by eBay. Uses reliable multicast from primary database to search nodes, in-memory search index, horizontal segmentation into N slices, load balancing over M instances, query caching
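To make two of the notes above a bit more concrete (splitting databases by a modulo on a key, and doing joins and sorting in the application layer), here is a minimal sketch in Python. It is my own illustration, not eBay's code: the shard count, the `ShardedStore` class, the `shard_for` and `items_with_seller` functions, and the sample rows are all assumptions, and plain in-memory dictionaries stand in for the real databases.

```python
from collections import defaultdict

NUM_SHARDS = 4  # assumed shard count, chosen only for illustration


def shard_for(key: int, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard by taking the access key modulo the number of shards."""
    return key % num_shards


class ShardedStore:
    """Hypothetical in-memory stand-in for databases split by one access key."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.num_shards = num_shards
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def insert(self, key: int, row: dict) -> None:
        # Every row with the same key lands on the same shard.
        self.shards[shard_for(key, self.num_shards)][key].append(row)

    def lookup(self, key: int) -> list:
        # Only one shard is consulted for a given key.
        return list(self.shards[shard_for(key, self.num_shards)].get(key, []))


def items_with_seller(items: ShardedStore, users: ShardedStore, seller_id: int) -> list:
    """Join items to their seller and sort by price in the application tier."""
    sellers = users.lookup(seller_id)
    if not sellers:
        return []
    seller = sellers[0]
    joined = [
        {"title": item["title"], "price": item["price"], "seller": seller["name"]}
        for item in items.lookup(seller_id)
    ]
    # The sort also happens here rather than in an ORDER BY clause.
    return sorted(joined, key=lambda row: row["price"])


users = ShardedStore()
items = ShardedStore()
users.insert(42, {"name": "alice"})
items.insert(42, {"title": "lamp", "price": 30})
items.insert(42, {"title": "book", "price": 5})

result = items_with_seller(items, users, 42)
```

Each lookup touches exactly one shard, and the join and sort cost CPU on the (cheap) application server instead of on the database, which is the trade-off described in the talk.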
There were way more questions by the packed audience of architects and other techies than there was time. Absolutely worth everybody's time.
Dan put the slides on his blog: eBaySDForum2006-11-29.pdf.