Reposted from: http://www.quora.com/How-does-YARN-compare-to-Mesos
Both systems have the same goal: allowing you to share a large cluster of machines between different frameworks.
For those who don't know, NextGen MapReduce is a project to factor the existing MapReduce into a generic layer that handles distributed process execution and resource scheduling (this system is called YARN) and
then implement MapReduce as an "application" on top of this.
Mesos was originally an academic research project with a very similar goal. They created a system which could run a patched version of Hadoop, MPI and other things. This has grown into an Apache Incubator project
in its own right.
I have been looking into these two a bit because we would love something like this at LinkedIn, and the nature of these things is that you really only want one (since you want to run everything on it). So at
the moment we don't have any real experience running stuff on top of either of these, but here is what I have pieced together (may be wrong in places):
- NextGen MapReduce (aka YARN) is primarily written in Java with bits of native code. Mesos is primarily written in C++.
- YARN only handles memory scheduling (e.g. you request x containers of y MB each), but with plans to extend it to other resources. I believe Mesos handles both memory and CPU scheduling, but I don't know the details. In practice I think the OS handles CPU
scheduling pretty well so I am not sure that would help our use cases. Supporting some kind of disk space and disk I/O scheduling and enforcement would be super cool, but I don't think either does that (yet).
- Mesos uses Linux container groups (http://lxc.sourceforge.net), and YARN uses simple Unix processes. Linux container
groups provide stronger isolation but may have some additional overhead.
- The resource request model is weirdly backwards in Mesos. In YARN you (the framework) request containers with a given specification and give locality preferences. In Mesos you get resource "offers" and choose to accept or reject those based on your own
scheduling policy. The Mesos model is arguably more flexible, but seemingly more work for the person implementing the framework.
- YARN is a pretty epic chunk of code, including all kinds of things right down to its own web framework. It is about 3x as much code as Mesos.
- YARN integrates something similar to the pluggable schedulers everyone knows and loves/hates in Hadoop. So if you are used to the capacity scheduler, hierarchical queues, and all that, you can get something similar. I don't think the Mesos scheduling capabilities
are quite as robust (they list hierarchical scheduling on their roadmap).
- YARN integrates with Kerberos and essentially inherits the Hadoop security architecture. I don't think Mesos attempts to deal with security.
- YARN directly handles rack and machine locality in your requests, which is convenient. In Mesos you can implement this, but it is less out of the box.
- Mesos is much more mature as a project at this point. It is a standalone thing, with great documentation and good starter examples. YARN exists only on Hadoop trunk (and some feature branches) in the mapreduce directory, and the docs are super sparse.
That said, the Hadoop guys have been really awesome at helping us get started with YARN (thanks Arun!) and they seem really committed to making sure it works
as a general purpose framework, not just for Hadoop. There seems to be a lot of momentum; it is just early.
- YARN is going to be the basis for Hadoop MapReduce going forward, so if you have a big Hadoop cluster and want to be able to run other stuff on it, that is likely appealing and will probably work more transparently than Mesos.
- YARN was written by the Yahoo/HortonWorks Hadoop team, which should know a thing or two about multi-tenancy and very large-scale cluster computing. YARN is not yet in a stable Hadoop release so I am not sure how much actual testing it has had or the
extent of deployment internally at Yahoo. Regardless, if/when the YARN team is able to get the majority of the world's Hadoop clusters successfully running on top of YARN, that will likely get the project to a level of hardening that will be hard to compete
with.
- Mesos ships with a number of out-of-the-box frameworks ported to it. This somewhat helps to validate the generality of their framework, but I don't know how much of a hack the various ports of things to it are.
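To make the request-versus-offer difference from the list above a bit more concrete, here is a toy sketch in Python. It does not use the real YARN or Mesos APIs. `Node`, `request_model`, `offer_model`, `make_cluster`, and `picky` are all hypothetical names made up for illustration, and only memory is modeled. Treat it as a rough picture of who makes the placement decision in each model, not as how either system is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_mb: int


def request_model(nodes, count, mb_each, preferred=()):
    """Request style (as described for YARN): the framework says what it
    wants, and the central scheduler picks nodes, preferred ones first."""
    # Stable sort: preferred nodes move to the front, the rest keep their order.
    ordered = sorted(nodes, key=lambda n: n.name not in preferred)
    granted = []
    for node in ordered:
        while len(granted) < count and node.free_mb >= mb_each:
            node.free_mb -= mb_each
            granted.append(node.name)
    return granted


def offer_model(nodes, accept):
    """Offer style (as described for Mesos): the scheduler offers each node's
    free memory, and the framework's own callback decides what to take."""
    launched = []
    for node in nodes:
        take_mb = accept(node.name, node.free_mb)  # 0 means "decline the offer"
        if 0 < take_mb <= node.free_mb:
            node.free_mb -= take_mb
            launched.append((node.name, take_mb))
    return launched


def make_cluster():
    # Hypothetical three-node cluster with different amounts of free memory.
    return [Node("a", 4096), Node("b", 2048), Node("c", 1024)]


def picky(name, free_mb):
    # Framework-side policy: only accept offers with at least 2048 MB free,
    # and take 1024 MB from each accepted offer.
    return 1024 if free_mb >= 2048 else 0


# Request style: ask for 3 containers of 1024 MB, preferring node "b".
granted = request_model(make_cluster(), count=3, mb_each=1024, preferred=("b",))

# Offer style: the framework filters offers with its own policy.
launched = offer_model(make_cluster(), picky)

print(granted)
print(launched)
```

In the request version the placement logic lives in the scheduler and the framework only supplies a specification plus a locality hint. In the offer version the scheduler is simpler, but every framework has to bring its own `accept` policy, which is the extra work mentioned above.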
Here are a few pointers for folks trying to find out more about Mesos:
- Docs: http://www.mesosproject.org/docu...
- Papers: http://www.mesosproject.org/rese...
- Sample framework implementations: https://github.com/mesos/mesos/t...
Here are some pointers on YARN:
- Master JIRA: https://issues.apache.org/jira/b...
- Article on the new resource scheduler: http://developer.yahoo.com/blogs...
- Design document for YARN. This is really essential for understanding their terminology of application masters, resource manager, etc. Before we found this, just looking at code, we were lost. https://issues.apache.org/jira/s...
- Spark, an iterative machine learning framework, has been ported to YARN, and serves as a great example of how to do this: https://github.com/mesos/spark-yarn
There is a thread on the Mesos mailing list that discusses differences further: http://mail-archives.apache.org/...