Google Architecture
Sun, 11/23/2008 - 02:01 — Todd Hoff
Update 2: Sorting 1 PB with MapReduce. PB is not peanut-butter-and-jelly misspelled. It's 1 petabyte or 1000 terabytes or 1,000,000 gigabytes. It took six hours and two minutes to sort 1PB (10 trillion 100-byte records) on 4,000 computers, and the results were replicated thrice on 48,000 disks.
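As a quick sanity check on the sort figures, the arithmetic can be worked through in a few lines. This is only a back-of-the-envelope sketch: it assumes decimal units (1 PB = 10^15 bytes, matching the "1000 terabytes" above) and that the work was spread evenly over all machines, which the source does not state. The variable names are illustrative.

```python
# Back-of-the-envelope check of the 1 PB sort figures.
# Assumptions: decimal units, work spread evenly across machines.
RECORDS = 10 * 10**12                  # 10 trillion records
RECORD_BYTES = 100                     # bytes per record
MACHINES = 4_000                       # computers used for the sort
ELAPSED_SECONDS = 6 * 3600 + 2 * 60    # six hours and two minutes

total_bytes = RECORDS * RECORD_BYTES             # data sorted, in bytes
aggregate_rate = total_bytes / ELAPSED_SECONDS   # bytes/second, whole cluster
per_machine_rate = aggregate_rate / MACHINES     # bytes/second, one machine

print(f"total data:       {total_bytes / 10**15:.1f} PB")
print(f"cluster rate:     {aggregate_rate / 10**9:.1f} GB/s")
print(f"per-machine rate: {per_machine_rate / 10**6:.1f} MB/s")
```

Under these assumptions the cluster sorts roughly 46 GB every second, which works out to about 11.5 MB/s per machine.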
Update: Greg Linden points to a new Google article, MapReduce: simplified data processing on large clusters. Some interesting stats: 100k MapReduce jobs are executed each day; more than 20 petabytes of data are processed per day; more than 10k MapReduce programs have been implemented; machines are dual processor with gigabit ethernet and 4-8 GB of memory.
Google is the King of scalability. Everyone knows Google for their large, sophisticated, and fast searching, but they don't just shine in search. Their platform approach to building scalable applications allows them to roll out internet-scale applications at an alarmingly high, competition-crushing rate. Their goal is always to build a higher-performing, higher-scaling infrastructure to support their products. How do they do that?
Information Sources
Platform
What's Inside?
The Stats
The Stack
Google visualizes their infrastructure as a three layer stack:
Reliable Storage Mechanism with GFS (Google File System)
- high reliability across data centers
- scalability to thousands of network nodes
- huge read/write bandwidth requirements
- support for large blocks of data which are gigabytes in size.
- efficient distribution of operations across nodes to reduce bottlenecks
- Master servers keep metadata on the various data files. Data are stored in the file system in 64MB chunks. Clients talk to the master servers to perform metadata operations on files and to locate the chunk server that contains the data they need on disk.
- Chunk servers store the actual data on disk. Each chunk is replicated across three different chunk servers to create redundancy in case of server crashes. Once directed by a master server, a client application retrieves files directly from chunk servers.
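The master/chunk-server split described above can be illustrated with a toy lookup. This is a minimal sketch, not GFS's actual interface: the `Master` class, its `allocate` and `locate` methods, the server names, and the round-robin replica placement are all assumptions made for illustration. Only the 64MB chunk size and the three replicas come from the text.

```python
CHUNK_SIZE = 64 * 1024 * 1024   # 64MB chunks, as described above
REPLICAS = 3                    # each chunk lives on three chunk servers


class Master:
    """Toy metadata server: maps (filename, chunk index) to replica locations.

    Hypothetical API; it holds metadata only, never file contents.
    """

    def __init__(self, chunk_servers):
        if len(chunk_servers) < REPLICAS:
            raise ValueError("need at least as many servers as replicas")
        self.chunk_servers = list(chunk_servers)
        self.locations = {}     # (filename, chunk_index) -> [server, ...]
        self._next = 0          # round-robin cursor (assumed placement policy)

    def allocate(self, filename, size_bytes):
        """Record replica locations for every chunk of a new file."""
        n_chunks = max(1, -(-size_bytes // CHUNK_SIZE))   # ceiling division
        for index in range(n_chunks):
            replicas = [
                self.chunk_servers[(self._next + r) % len(self.chunk_servers)]
                for r in range(REPLICAS)
            ]
            self._next += 1
            self.locations[(filename, index)] = replicas

    def locate(self, filename, offset):
        """Return (chunk index, replica servers) for a byte offset."""
        index = offset // CHUNK_SIZE
        return index, self.locations[(filename, index)]


master = Master(["cs1", "cs2", "cs3", "cs4", "cs5"])
master.allocate("crawl.log", 200 * 1024 * 1024)          # 200MB file
chunk_index, servers = master.locate("crawl.log", 130 * 1024 * 1024)
print(chunk_index, servers)
```

After the `locate` call, a client would read the bytes directly from one of the returned chunk servers, which keeps the master out of the data path.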
Do Something With the Data Using MapReduce
- Nice way to partition tasks across lots of machines.
- Handle machine failure.
- Works across different application types, like search and ads. Almost every application has map reduce type operations. You can precompute useful data, find word counts, sort TBs of data, etc.
- Computation can automatically move closer to the IO source.
- The Master server assigns user tasks to map and reduce servers. It also tracks the state of the tasks.
- The Map servers accept user input and perform map operations on it. The results are written to intermediate files.
- The Reduce servers accept the intermediate files produced by the map servers and perform reduce operations on them.
- The steps look like: GFS -> Map -> Shuffle -> Reduction -> Store Results back into GFS.
- In MapReduce, a map transforms one view of the data into another, producing key-value pairs; in the word-count example, each pair is a word and its count.
- Shuffling groups the pairs by key.
- The reduction sums up all the key-value pairs and produces the final answer.
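The map, shuffle, and reduce steps for the word-count case can be sketched in a single process. This is only an illustration of the data flow: the function names are made up here, and a real MapReduce run spreads these phases across many machines and reads from and writes to GFS.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(documents):
    """Run map -> shuffle -> reduce over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["the cat", "The dog the"])
print(counts)
```

Because each `map_phase` call only sees one document and each `reduce_phase` call only sees one key, both phases can be split across machines without coordination; the shuffle is the only step that moves data between them.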
Storing Structured Data in BigTable
- The Master servers assign tablets to tablet servers. They track where tablets are located and redistribute tasks as needed.
- The Tablet servers process read/write requests for tablets. They split tablets when they exceed size limits (usually 100MB - 200MB). When a tablet server fails, 100 tablet servers each pick up 1 new tablet and the system recovers.
- The Lock servers form a distributed lock service. Operations like opening a tablet for writing, Master arbitration, and access control checking require mutual exclusion.
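Tablet splitting can be illustrated with a toy table that partitions sorted row keys into tablets. This is a rough sketch under stated assumptions: the `TabletTable` class and its methods are hypothetical, each tablet is modelled as a contiguous row-key range, and a row-count limit stands in for the 100MB - 200MB size limit mentioned above.

```python
import bisect


class TabletTable:
    """Toy table whose rows are partitioned into tablets by row-key range.

    Hypothetical model: a row-count limit replaces the real byte-size limit.
    """

    def __init__(self, max_rows_per_tablet=4):
        self.max_rows = max_rows_per_tablet
        self.start_keys = [""]   # first row key of each tablet, kept sorted
        self.tablets = [{}]      # rows held by each tablet

    def _index(self, row_key):
        """Find the tablet whose key range contains row_key."""
        return bisect.bisect_right(self.start_keys, row_key) - 1

    def put(self, row_key, value):
        i = self._index(row_key)
        self.tablets[i][row_key] = value
        if len(self.tablets[i]) > self.max_rows:
            self._split(i)

    def get(self, row_key):
        return self.tablets[self._index(row_key)].get(row_key)

    def _split(self, i):
        """Split tablet i at its median row key into two tablets."""
        rows = self.tablets[i]
        keys = sorted(rows)
        middle = keys[len(keys) // 2]
        self.tablets[i] = {k: v for k, v in rows.items() if k < middle}
        self.start_keys.insert(i + 1, middle)
        self.tablets.insert(i + 1, {k: v for k, v in rows.items() if k >= middle})


table = TabletTable(max_rows_per_tablet=2)
for key in ["a", "b", "c", "d"]:
    table.put(key, key.upper())
print(table.start_keys, table.get("c"))
```

Keeping the start keys sorted lets a lookup find the right tablet with a binary search, and a split only touches the one tablet that grew too large.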
Hardware
Misc
Future Directions for Google
Lessons Learned