Scaling Twitter: Making Twitter 10000 Percent Faster
Thu, 01/17/2008 - 16:08 — Todd Hoff
Update 2: a commenter in Twitter Fails Macworld Keynote Test said this entry needs to be updated. LOL. My uneducated guess is it's not a language or architecture problem, but more a problem of not being able to add hardware fast enough into their data center. The predictability of this problem is debatable, but once you have it, it's hard to fix.
Update: Twitter releases Starling, a light-weight persistent queue server that speaks the MemCache protocol. It was built to drive Twitter's backend, and is in production across Twitter's cluster.
Twitter started as a side project and blew up fast, going from 0 to millions of page views within a few terrifying months. Early design decisions that worked well in the small melted under the crush of new users chirping tweets to all their friends. Web darling Ruby on Rails was fingered early for the scaling problems, but Blaine Cook, Twitter's lead architect, held Ruby blameless:
For us, it’s really about scaling horizontally - to that end, Rails and Ruby haven’t been stumbling blocks, compared to any other language or framework. The performance boosts associated with a “faster” language would give us a 10-20% improvement, but thanks to architectural changes that Ruby and Rails happily accommodated, Twitter is 10000% faster than it was in January.
If Ruby on Rails wasn't to blame, how did Twitter learn to scale ever higher and higher?
Update: added slides Small Talk on Getting Big. Scaling a Rails App & all that Jazz
Site: http://twitter.com
Information Sources
The Platform
The Stats
The Architecture
- For example, if getting a count is slow, you can memoize the count into memcache in a millisecond.
- Getting your friends' status is complicated. There are security and other issues. So rather than doing a query, a friend's status is updated in cache instead. It never touches the database. This gives a predictable response time frame (upper bound 20 msecs).
- ActiveRecord objects are huge, which is why they aren't cached. Instead, they want to store critical attributes in a hash and lazy load the other attributes on access.
- 90% of requests are API requests. So don't do any page/fragment caching on the front-end. The pages are so time sensitive it doesn't do any good. But they cache API requests.
- Use messaging a lot. Producers produce messages, which are queued, and then are distributed to consumers. Twitter's main functionality is to act as a messaging bridge between different formats (SMS, web, IM, etc).
- Send a message to invalidate friends' caches in the background instead of invalidating each one individually and synchronously.
- Started with DRb, which stands for Distributed Ruby, a library that allows you to send and receive messages from remote Ruby objects via TCP/IP. But it was a little flaky and a single point of failure.
- Moved to Rinda, which is a shared queue that uses a tuplespace model, along the lines of Linda. But the queues are not persistent, so messages are lost on failure.
- Tried Erlang. Problem: how do you get a broken server running on a Sunday or Monday with 20,000 users waiting? The developer didn't know, and there wasn't a lot of documentation. So it violates the "use what you know" rule.
- Moved to Starling, a distributed queue written in Ruby.
- Distributed queues were made to survive system crashes by writing them to disk. Other big websites take this simple approach as well.
- They do a review and push out new mongrel servers. No graceful way yet.
- An internal server error is given to the user if their mongrel server is replaced.
- All servers are killed at once. A rolling blackout isn't used because the message queue state is in the mongrels and a rolling approach would cause all the queues in the remaining mongrels to fill up.
- A lot of down time because people crawl the site and add everyone as friends. 9000 friends in 24 hours. It would take down the site.
- Build tools to detect these problems so you can pinpoint when and where they are happening.
- Be ruthless. Delete them as users.
- Plan to partition in the future. Currently they don't. These changes have been enough so far.
- The partition scheme will be based on time, not users, because most requests are very temporally local.
- Partitioning will be difficult because of automatic memoization. They can't guarantee read-only operations will really be read-only. An operation may write to a read-only slave, which is really bad.
- Their API is the most important thing Twitter has done.
- Keeping the service simple allowed developers to build on top of their infrastructure and come up with ideas that are way better than Twitter could come up with. For example, Twitterrific, which is a beautiful way to use Twitter that a small team with different priorities could create.
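The caching bullets above (memoizing a slow count into memcache and invalidating it when the data changes) can be sketched roughly as follows. This is a minimal illustration, not Twitter's code: `DictCache` is a hypothetical in-process stand-in for a memcache client, and `slow_follower_count`, the `db` dictionary, and the key format are assumptions made up for the example.

```python
import time


class DictCache:
    """Hypothetical in-process stand-in for a memcache client (get/set/delete)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # entry expired, treat as a miss
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else time.monotonic() + ttl
        self._store[key] = (value, expires_at)

    def delete(self, key):
        self._store.pop(key, None)


def slow_follower_count(db, user_id):
    """Pretend this is an expensive COUNT query; it records how often it runs."""
    db["queries"] += 1
    return len(db["followers"].get(user_id, ()))


def follower_count(cache, db, user_id, ttl=60):
    """Return the count from cache, computing and storing it on a miss."""
    key = f"follower_count:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    count = slow_follower_count(db, user_id)
    cache.set(key, count, ttl=ttl)
    return count


def add_follower(cache, db, user_id, follower_id):
    """Write path: update the data, then invalidate the memoized count."""
    db["followers"].setdefault(user_id, []).append(follower_id)
    cache.delete(f"follower_count:{user_id}")
```

Repeated calls to `follower_count` hit the slow path only once until `add_follower` invalidates the key. In the article's design that invalidation would be sent as a background message rather than done inline.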
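The messaging bullets above say the queues were made to survive crashes by writing them to disk. One common way to do that is an append-only journal that is replayed on startup. The sketch below shows that idea only; it is not Starling's implementation, and the `JournaledQueue` class, its JSON record format, and the file layout are all assumptions for illustration.

```python
import json
import os
import tempfile
from collections import deque


class JournaledQueue:
    """FIFO queue that logs every put/get to a journal file before applying it."""

    def __init__(self, path):
        self._path = path
        self._items = deque()
        self._replay()  # rebuild in-memory state from any earlier run
        self._journal = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self._path):
            return
        with open(self._path, encoding="utf-8") as journal:
            for line in journal:
                try:
                    record = json.loads(line)
                except ValueError:
                    break  # truncated final record from a crash: stop replaying
                if record["op"] == "put":
                    self._items.append(record["msg"])
                elif record["op"] == "get" and self._items:
                    self._items.popleft()

    def _log(self, record):
        self._journal.write(json.dumps(record) + "\n")
        self._journal.flush()
        os.fsync(self._journal.fileno())  # force the record onto disk

    def put(self, message):
        self._log({"op": "put", "msg": message})
        self._items.append(message)

    def get(self):
        """Return the oldest message, or None if the queue is empty."""
        if not self._items:
            return None
        self._log({"op": "get"})
        return self._items.popleft()

    def close(self):
        self._journal.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "queue.journal")
    queue = JournaledQueue(path)
    queue.put("invalidate cache for user 1")
    queue.put("invalidate cache for user 2")
    queue.get()
    queue.close()
    # Simulate a restart: a new instance replays the journal.
    restarted = JournaledQueue(path)
    print(restarted.get())
    restarted.close()
```

The journal here grows without bound. A real queue server would compact or rotate it, which this sketch leaves out.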
Lessons Learned
- Index everything. Rails won't do this for you.
- Use explain to see how your queries are running. Indexes may not be used as you expect.
- Denormalize a lot. It single-handedly saved them. For example, they store all of a user ID's friend IDs together, which prevented a lot of costly joins.
- Avoid complex joins.
- Avoid scanning large sets of data.
- You want to know when you deploy an application that it will render correctly.
- They have a full test suite now. So when the caching broke they were able to find the problem before going live.
- Scale changes what can be stupid.
- Trying to load 3000 friends at once into memory can bring a server down, but when there were only 4 friends it worked great.
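The first two lessons (index everything, then use explain to confirm the index is actually used) can be tried out with a small experiment. The sketch below uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` as a stand-in for whatever database is in production. The `statuses` table, its columns, and the index name are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statuses (id INTEGER PRIMARY KEY, user_id INTEGER, text TEXT)"
)
conn.executemany(
    "INSERT INTO statuses (user_id, text) VALUES (?, ?)",
    [(i % 50, f"tweet {i}") for i in range(1000)],
)

sql = "SELECT text FROM statuses WHERE user_id = ?"

# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, sql, (7,))

# Nothing creates this index automatically; it has to be added explicitly.
conn.execute("CREATE INDEX idx_statuses_user_id ON statuses (user_id)")
plan_after = query_plan(conn, sql, (7,))

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should describe a full table scan and the second should name `idx_statuses_user_id`. Checking plans this way is how you catch a query that silently ignores an index you expected it to use.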
Related Articles