YouTube Architecture

masterkey


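YouTube also runs on open-source components such as Apache, MySQL, Python, Linux, and lighttpd.

It can support 100 million video views per day.

Founded: 2/2005

3/2006: 30 million views/day

7/2006: 100 million views/day

The original article follows: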

Update: YouTube: The Platform. YouTube adds a new rich set of APIs in order to become your video platform leader--all for free. Upload, edit, watch, search, and comment on video from your own site without visiting YouTube. Compose your site internally from APIs because you'll need to expose them later anyway.

YouTube grew incredibly fast, to over 100 million video views per day, with only a handful of people responsible for scaling the site. How did they manage to deliver all that video to all those users? And how have they evolved since being acquired by Google?

Information Sources

Google Video

Platform

Apache

Python

Linux (SUSE)

MySQL

psyco, a specializing just-in-time (JIT) compiler for Python

lighttpd for video instead of Apache

What's Inside?

The Stats

Supports the delivery of over 100 million videos per day.

Founded 2/2005

3/2006 30 million video views/day

7/2006 100 million video views/day

2 sysadmins, 2 scalability software architects

2 feature developers, 2 network engineers, 1 DBA

Recipe for handling rapid growth

while (true)
{
identify_and_fix_bottlenecks();
drink();
sleep();
notice_new_bottleneck();
}

This loop runs many times a day.

Web Servers

NetScaler is used for load balancing and caching static content.

Run Apache with mod_fastcgi.

Requests are routed for handling by a Python application server.

The application server talks to various databases and other information sources to get all the data, then formats the HTML page.

Can usually scale web tier by adding more machines.

The Python web code is usually NOT the bottleneck; it spends most of its time blocked on RPCs.

Python allows rapid flexible development and deployment. This is critical given the competition they face.

Usually less than 100 ms page service times.

Use psyco, a specializing just-in-time (JIT) compiler for Python, to optimize inner loops.

For CPU-intensive activities like encryption, they use C extensions.

Pre-generated, cached HTML is used for some expensive-to-render blocks.

Row level caching in the database.

Fully formed Python objects are cached.

Some data are calculated and sent to each application so the values are cached in local memory. This is an underused strategy. The fastest cache is in your application server and it doesn't take much time to send precalculated data to all your servers. Just have an agent that watches for changes, precalculates, and sends.
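The "agent that watches, precalculates, and sends" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names LocalCache and PrecalcAgent and the top_tags computation are hypothetical, and the push is a direct method call here, whereas in a real deployment it would be a network call to each application server.

```python
import threading


class LocalCache:
    """In-process cache held by one application server (hypothetical)."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def receive(self, snapshot):
        # Swap in the whole table at once so readers never see a partial update.
        with self._lock:
            self._values = dict(snapshot)

    def get(self, key, default=None):
        with self._lock:
            return self._values.get(key, default)


class PrecalcAgent:
    """Precalculates values when source data changes and pushes them to all servers."""

    def __init__(self, servers, compute):
        self._servers = servers
        self._compute = compute
        self._last = None

    def on_change(self, source_data):
        snapshot = self._compute(source_data)
        if snapshot == self._last:
            return False  # result unchanged, nothing to send
        for server in self._servers:
            server.receive(snapshot)
        self._last = snapshot
        return True


def top_tags(tag_counts, n=2):
    # Example precalculation: the n most used tags, ties broken alphabetically.
    ranked = sorted(tag_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {"top_tags": [tag for tag, _ in ranked[:n]]}


servers = [LocalCache() for _ in range(3)]
agent = PrecalcAgent(servers, top_tags)
agent.on_change({"music": 40, "comedy": 25, "sports": 10})
```

Each request handler then reads the value with a plain in-memory lookup such as servers[0].get("top_tags"), with no RPC on the request path.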

Video Serving

Costs include bandwidth, hardware, and power consumption.

Each video is hosted by a mini-cluster, so each video is served by more than one machine.

Using a cluster means:
- More disks serving content which means more speed.
- Headroom. If a machine goes down others can take over.
- There are online backups.

Servers use the lighttpd web server for video:
- Apache had too much overhead.
- Uses epoll to wait on multiple fds.
- Switched from single process to multiple process configuration to handle more connections.
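The point of epoll is that one process can wait on many file descriptors at once instead of dedicating a thread or process to each connection. The sketch below shows that pattern with Python's standard selectors module, which picks epoll on Linux. It is not lighttpd's code; the socket pairs stand in for client connections and the labels conn-0, conn-1, conn-2 are made up for illustration.

```python
import selectors
import socket


def read_ready(sel, timeout=1.0):
    """Wait once, then read from every connection that has data pending."""
    received = []
    for key, _events in sel.select(timeout=timeout):
        received.append((key.data, key.fileobj.recv(4096)))
    return received


sel = selectors.DefaultSelector()  # epoll on Linux, kqueue or select elsewhere
pairs = [socket.socketpair() for _ in range(3)]
for i, (reader, _writer) in enumerate(pairs):
    reader.setblocking(False)
    sel.register(reader, selectors.EVENT_READ, data=f"conn-{i}")

# Only two of the three "clients" send a request.
pairs[0][1].sendall(b"GET /video/a")
pairs[2][1].sendall(b"GET /video/c")

ready = read_ready(sel)

# Clean up the selector and all sockets.
sel.close()
for reader, writer in pairs:
    reader.close()
    writer.close()
```

The idle connection costs nothing while waiting; only the two sockets with pending data are returned by the select call.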

Most popular content is moved to a CDN (content delivery network):
- CDNs replicate content in multiple places. There's a better chance of content being closer to the user, with fewer hops, and content will run over a more friendly network.
- CDN machines mostly serve out of memory because the content is so popular there's little thrashing of content into and out of memory.

Less popular content (1-20 views per day) uses YouTube servers in various colo sites.
- There's a long tail effect. Any single video may have only a few plays, but a great many videos are being played, so random disk blocks are being accessed.
- Caching doesn't do a lot of good in this scenario, so spending money on more cache may not make sense. This is a very interesting point. If you have a long tail product, caching won't always be your performance savior.
- Tune RAID controller and pay attention to other lower level issues to help.
- Tune memory on each machine so there's not too much and not too little.
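A small simulation can make the long tail point concrete. The sketch below compares LRU cache hit rates for two synthetic request streams; the catalog size, cache capacity, and 90% hot share are invented numbers chosen only for illustration, not YouTube measurements.

```python
import random
from collections import OrderedDict


def lru_hit_rate(requests, capacity):
    """Replay a request stream through an LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for key in requests:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


def popular_requests(rng, n, hot_items, catalog_size, hot_share):
    # Most requests go to a small hot set, the rest are spread over the catalog.
    for _ in range(n):
        if rng.random() < hot_share:
            yield rng.randrange(hot_items)
        else:
            yield rng.randrange(catalog_size)


def long_tail_requests(rng, n, catalog_size):
    # Every video is equally (un)popular: requests are spread evenly.
    for _ in range(n):
        yield rng.randrange(catalog_size)


rng = random.Random(42)
popular_rate = lru_hit_rate(
    popular_requests(rng, 20_000, 100, 100_000, 0.9), capacity=1_000
)
tail_rate = lru_hit_rate(long_tail_requests(rng, 20_000, 100_000), capacity=1_000)
```

With the same cache size, the popular stream is served almost entirely from cache while the long tail stream almost always misses, which is why the disk and RAID tuning above matters more than extra cache for this tier.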

Serving Video Key Points

Keep it simple and cheap.

Keep a simple network path. Not too many devices between content and users. Routers, switches, and other appliances may not be able to keep up with so much load.

Use commodity hardware. The more expensive the hardware, the more expensive everything else gets too (support contracts, for example). You are also less likely to find help on the net.

Use simple common tools. They mostly use tools built into Linux and layer on top of those.

Handle random seeks well (SATA, tweaks).

Serving Thumbnails

Surprisingly difficult to do efficiently.

There are about 4 thumbnails for each video, so there are far more thumbnails than videos.

Thumbnails are hosted on just a few machines.

Saw problems associated with serving a lot of small objects:
- Lots of disk seeks and problems with inode caches and page caches at OS level.
- Ran into per directory file limit. Ext3 in particular. Moved to a more hierarchical structure. Recent improvements in the 2.6 kernel may improve Ext3 large directory handling up to 100 times, yet storing lots of files in a file system is still not a good idea.
- A high number of requests/sec as web pages can display 60 thumbnails on page.
- Under such high loads Apache performed badly.
- Used squid (reverse proxy) in front of Apache. This worked for a while, but as load increased performance eventually decreased. Went from 300 requests/second to 20.
- Tried using lighttpd, but in single-threaded mode it stalled. Ran into problems with multi-process mode because each process would keep a separate cache.
- With so many images setting up a new machine took over 24 hours.
- After rebooting a machine, it took 6-10 hours for the cache to warm up enough to avoid going to disk.
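The "more hierarchical structure" used to get around the per-directory file limit is not described in detail. One common way to do it, shown here purely as an assumption, is to derive nested directory names from a hash of the video ID so that no single directory grows too large. The root path, file naming scheme, and two-level depth below are all hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def thumbnail_path(video_id, index, root="/thumbs", levels=2):
    """Map a thumbnail to a nested directory derived from a hash of the video ID."""
    digest = hashlib.md5(video_id.encode("utf-8")).hexdigest()
    # Two hex characters per level gives 256 subdirectories at each level.
    shards = [digest[2 * i: 2 * i + 2] for i in range(levels)]
    return PurePosixPath(root, *shards, f"{video_id}_{index}.jpg")


example = thumbnail_path("abc123", 0)
```

With two levels the files are spread over 256 * 256 = 65,536 directories. As the list above notes, this only eases the directory limit; the seek and cache problems of many small files remain.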

To solve all their problems they started using Google's BigTable, a distributed data store:
- Avoids small file problem because it clumps files together.
- Fast and fault tolerant. Assumes it's working on an unreliable network.
- Lower latency because it uses a distributed multilevel cache. This cache works across different colocation sites.
- For more information on BigTable take a look at Google Architecture, GoogleTalk Architecture, and BigTable.

Databases

The Early Years
- Used MySQL to store metadata like users, tags, and descriptions.
- Served data off a monolithic RAID 10 Volume with 10 disks.
- Living off credit cards so they leased hardware. When they needed more hardware to handle load it took a few days to order and get delivered.
- They went through a common evolution: a single server, then a single master with multiple read slaves, then a partitioned database, and finally a sharding approach.
- Suffered from replica lag. The master is multi-threaded and runs on a large machine so it can handle a lot of work. Slaves are single threaded and usually run on lesser machines and replication is asynchronous, so the slaves can lag significantly behind the master.
- Updates cause cache misses, which go to disk, where slow I/O causes slow replication.
- Using a replicating architecture you need to spend a lot of money for incremental bits of write performance.
- One of their solutions was to prioritize traffic by splitting the data into two clusters: a video watch pool and a general cluster. The idea is that people want to watch video, so that function should get the most resources. The social networking features of YouTube are less important, so they can be routed to a less capable cluster.
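The watch pool versus general cluster split amounts to a routing decision per request. A minimal sketch, assuming made-up host names and request type labels (none of these come from the original article), could look like this:

```python
import itertools


class Pool:
    """A named group of database hosts handed out round-robin (hypothetical)."""

    def __init__(self, name, hosts):
        self.name = name
        self._hosts = itertools.cycle(hosts)

    def next_host(self):
        return next(self._hosts)


# The watch pool gets more machines because watching video is the core function.
WATCH_POOL = Pool("watch", ["watch-db-1", "watch-db-2", "watch-db-3"])
GENERAL_CLUSTER = Pool("general", ["general-db-1"])

# Assumed request labels; anything not listed here is treated as general traffic.
WATCH_REQUEST_TYPES = frozenset({"watch", "video_metadata"})


def route(request_type):
    """Return (pool name, host) for a request type."""
    pool = WATCH_POOL if request_type in WATCH_REQUEST_TYPES else GENERAL_CLUSTER
    return pool.name, pool.next_host()


first = route("watch")
second = route("watch")
social = route("comment")
```

Because the two pools are separate, a slow query on the social features cannot starve the queries needed to start playing a video.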

The Later Years
- Went to database partitioning.
- Split into shards with users assigned to different shards.
- Spreads writes and reads.
- Much better cache locality which means less IO.
- Resulted in a 30% hardware reduction.
- Reduced replica lag to 0.
- Can now scale database almost arbitrarily.
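How users were assigned to shards is not stated. The sketch below assumes the simplest scheme, a stable hash of the user ID modulo the shard count, and uses in-memory dictionaries in place of real database servers. The names shard_for_user and ShardedStore are hypothetical.

```python
import zlib


def shard_for_user(user_id, num_shards):
    """Pick a shard from a stable hash of the user ID (assumed scheme)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    return zlib.crc32(str(user_id).encode("utf-8")) % num_shards


class ShardedStore:
    """Routes each user's reads and writes to exactly one shard."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def _shard(self, user_id):
        return self.shards[shard_for_user(user_id, len(self.shards))]

    def put(self, user_id, record):
        self._shard(user_id)[user_id] = record

    def get(self, user_id):
        return self._shard(user_id).get(user_id)


store = ShardedStore(4)
store.put(42, {"name": "example-user", "videos": 3})
record = store.get(42)
```

All of one user's data lives on one shard, which is what gives the better cache locality mentioned above. Plain modulo hashing makes adding shards later expensive, so real systems often use a lookup table or consistent hashing instead.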

Data Center Strategy

Used managed hosting providers at first. Living off credit cards, so it was the only way.

Managed hosting can't scale with you. You can't control hardware or make favorable networking agreements.

So they went to a colocation arrangement. Now they can customize everything and negotiate their own contracts.

Use 5 or 6 data centers plus the CDN.

Videos come out of any data center. Not closest match or anything. If a video is popular enough it will move into the CDN.

Video is bandwidth dependent, not really latency dependent, so it can come from any colo.

For images latency matters, especially when you have 60 images on a page.

Images are replicated to different data centers using BigTable. Code looks at different metrics to know who is closest.
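The article does not say which metrics are used to decide who is closest. As an illustration only, the sketch below scores each site by round-trip time plus a load penalty and skips unhealthy sites. The metric names, the penalty weight, and the site names are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteMetrics:
    name: str
    rtt_ms: float         # measured round-trip time to the user
    load: float           # 0.0 idle .. 1.0 saturated
    healthy: bool = True


def pick_site(sites, load_penalty_ms=100.0):
    """Choose the site with the lowest latency score among usable sites."""
    candidates = [s for s in sites if s.healthy and s.load < 1.0]
    if not candidates:
        raise LookupError("no healthy site available")
    # A busy site is treated as if it were further away.
    return min(candidates, key=lambda s: s.rtt_ms + load_penalty_ms * s.load)


sites = [
    SiteMetrics("colo-east", rtt_ms=20, load=0.9),
    SiteMetrics("colo-west", rtt_ms=70, load=0.1),
    SiteMetrics("colo-eu", rtt_ms=15, load=0.2, healthy=False),
]
chosen = pick_site(sites)
```

Here the nearest healthy site is heavily loaded, so the slightly more distant but idle site wins. This matters for images because a page with 60 thumbnails pays the latency cost many times over.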

Lessons Learned

Stall for time. Creative and risky tricks can help you cope in the short term while you work out longer term solutions.

Prioritize. Know what's essential to your service and prioritize your resources and efforts around those priorities.

Pick your battles. Don't be afraid to outsource some essential services. YouTube uses a CDN to distribute their most popular content. Creating their own network would have taken too long and cost too much. You may have similar opportunities in your system. Take a look at Software as a Service for more ideas.

Keep it simple! Simplicity allows you to rearchitect more quickly so you can respond to problems. It's true that nobody really knows what simplicity is, but if you aren't afraid to make changes then that's a good sign simplicity is happening.

Shard. Sharding helps to isolate and constrain storage, CPU, memory, and IO. It's not just about getting more write performance.

Constant iteration on bottlenecks:
- Software: DB, caching
- OS: disk I/O
- Hardware: memory, RAID

You succeed as a team. Have a good cross discipline team that understands the whole system and what's underneath the system. People who can set up printers, machines, install networks, and so on. With a good team all things are possible.

2008-03-13 16:20