Apache Hadoop YARN – ResourceManager
As previously described, ResourceManager (RM) is the master that arbitrates all the available cluster resources and thus helps manage the distributed applications running on the YARN system. It works together with the per-node NodeManagers (NMs) and the per-application ApplicationMasters (AMs).
- NodeManagers take instructions from the ResourceManager and manage resources available on a single node.
- ApplicationMasters are responsible for negotiating resources with the ResourceManager and for working with the NodeManagers to start the containers.
ResourceManager Components
The ResourceManager has the following components (see the figure above):
- Components interfacing RM to the clients:
- ClientService: The client interface to the ResourceManager. This component handles all the RPC interfaces to the RM from the clients including operations like application submission, application termination, obtaining queue information, cluster statistics etc. A minimal client-side sketch of these operations appears after this group.
- AdminService: To make sure that admin requests don’t get starved by normal users’ requests and to give operators’ commands higher priority, all the admin operations like refreshing the node-list, the queues’ configuration etc. are served via this separate interface.
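To give a concrete picture of the operations served by ClientService, here is a minimal client-side sketch using the YarnClient library, which wraps the same RPC interface. This is an illustrative sketch only: the queue name is a placeholder and the exact API may differ slightly across Hadoop versions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.QueueInfo;
import org.apache.hadoop.yarn.api.records.YarnClusterMetrics;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class ClientServiceSketch {
  public static void main(String[] args) throws Exception {
    // Client-side handle; its RPCs are served by the RM's ClientService.
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new Configuration());
    yarnClient.start();

    // Cluster statistics and queue information.
    YarnClusterMetrics metrics = yarnClient.getYarnClusterMetrics();
    QueueInfo queue = yarnClient.getQueueInfo("default");  // queue name is a placeholder
    System.out.println("NodeManagers: " + metrics.getNumNodeManagers()
        + ", queue capacity: " + queue.getCapacity());

    // Application listing; termination would go through the same interface,
    // e.g. yarnClient.killApplication(appId).
    for (ApplicationReport report : yarnClient.getApplications()) {
      System.out.println(report.getApplicationId() + " " + report.getYarnApplicationState());
    }

    yarnClient.stop();
  }
}
```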
- Components connecting RM to the nodes:
- ResourceTrackerService: This is the component that responds to RPCs from all the nodes. It is responsible for registering new nodes, rejecting requests from any invalid/decommissioned nodes, and obtaining node-heartbeats and forwarding them over to the YarnScheduler. It works closely with NMLivelinessMonitor and NodesListManager described below.
- NMLivelinessMonitor: To keep track of live nodes and specifically note down the dead nodes, this component keeps track of each node’s last heartbeat time. Any node that doesn’t heartbeat within a configured interval of time, by default 10 minutes, is deemed dead and is expired by the RM. All the containers currently running on an expired node are marked as dead and no new containers are scheduled on such a node.
- NodesListManager: A collection of valid and excluded nodes. Responsible for reading the host configuration files specified via yarn.resourcemanager.nodes.include-path and yarn.resourcemanager.nodes.exclude-path and seeding the initial list of nodes based on those files. Also keeps track of nodes that are decommissioned as time progresses. A small configuration sketch for these settings follows this group.
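As a rough illustration of how these settings come together, the sketch below sets the two host-file properties named above along with the NM expiry interval used by the NMLivelinessMonitor. In practice these entries live in yarn-site.xml rather than in code; the file paths are placeholders, and the expiry-interval property name and its 10-minute default are stated to the best of my knowledge.

```java
import org.apache.hadoop.conf.Configuration;

public class NodeListConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // Host files read by the NodesListManager (paths are placeholders).
    conf.set("yarn.resourcemanager.nodes.include-path", "/etc/hadoop/conf/yarn.include");
    conf.set("yarn.resourcemanager.nodes.exclude-path", "/etc/hadoop/conf/yarn.exclude");

    // Heartbeat expiry used by the NMLivelinessMonitor; property name assumed here,
    // default of 10 minutes expressed in milliseconds.
    conf.setLong("yarn.nm.liveness-monitor.expiry-interval-ms", 10 * 60 * 1000L);
  }
}
```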
- Components interacting with the per-application AMs:
- ApplicationMasterService: This is the component that responds to RPCs from all the AMs. It is responsible for registering new AMs, handling termination/unregister requests from any finishing AMs, and obtaining container-allocation & deallocation requests from all running AMs and forwarding them over to the YarnScheduler. This works closely with the AMLivelinessMonitor described below; a minimal AM-side sketch of this protocol follows this group.
- AMLivelinessMonitor: To help manage the list of live AMs and dead/non-responding AMs, this component keeps track of each AM and its last heartbeat time. Any AM that doesn’t heartbeat within a configured interval of time, by default 10 minutes, is deemed dead and is expired by the RM. All the containers currently running/allocated to an AM that gets expired are marked as dead. RM schedules the same AM to run on a new container, allowing up to a maximum of 4 such attempts by default.
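For a concrete view of the AM side of this protocol, here is a minimal sketch using the AMRMClient library; every call below travels over the RPC interface served by ApplicationMasterService, and the allocate() heartbeat is what the AMLivelinessMonitor tracks. The host name, tracking URLs and progress value are placeholders, and a real AM would of course do much more.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class AmProtocolSketch {
  public static void main(String[] args) throws Exception {
    AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
    rmClient.init(new Configuration());
    rmClient.start();

    // Registration of a new AM with the RM (host, port and tracking URL are placeholders).
    rmClient.registerApplicationMaster("am-host.example.com", 0, "");

    // Periodic allocate() calls double as heartbeats tracked by the AMLivelinessMonitor
    // and carry container allocation/deallocation requests to the YarnScheduler.
    AllocateResponse response = rmClient.allocate(0.1f);
    System.out.println("Newly allocated containers: "
        + response.getAllocatedContainers().size());

    // Unregister when the application finishes.
    rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "done", "");
    rmClient.stop();
  }
}
```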
- The core of the ResourceManager – the scheduler and related components:
- ApplicationsManager: Responsible for maintaining a collection of submitted applications. Also keeps a cache of completed applications so as to serve users’ requests via web UI or command line long after the applications in question have finished.
- ApplicationACLsManager: The RM needs to gate the user-facing APIs like the client and admin requests so that they are accessible only to authorized users. This component maintains the ACL lists per application and enforces them whenever a request such as killing an application or viewing an application status is received.
- ApplicationMasterLauncher: Maintains a thread-pool to launch AMs of newly submitted applications as well as applications whose previous AM attempts exited due to some reason. Also responsible for cleaning up the AM when an application has finished normally or has been forcefully terminated.
- YarnScheduler: The Scheduler is responsible for allocating resources to the various running applications subject to constraints of capacities, queues etc. It performs its scheduling function based on the resource requirements of the applications such as memory, CPU, disk, network etc. Currently, only memory is supported and support for CPU is close to completion. A sketch of such a resource request appears after this group.
- ContainerAllocationExpirer: This component is in charge of ensuring that all allocated containers are used by AMs and subsequently launched on the corresponding NMs. AMs run as untrusted user code and can potentially hold on to allocations without using them, and as such can cause cluster under-utilization. To address this, ContainerAllocationExpirer maintains the list of allocated containers that are still not used on the corresponding NMs. For any container, if the corresponding NM doesn’t report to the RM that the container has started running within a configured interval of time, by default 10 minutes, the container is deemed dead and is expired by the RM.
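To illustrate what a resource requirement handed to the YarnScheduler looks like, here is a small sketch that builds a container request with the AMRMClient helper classes. The values (1 GB of memory, priority 0) are arbitrary, and, as noted above, only the memory dimension is honoured at the time of writing.

```java
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.util.Records;

public class ResourceRequestSketch {
  public static ContainerRequest buildRequest() {
    // Resource requirement the YarnScheduler acts on; only memory is honoured today.
    Resource capability = Records.newRecord(Resource.class);
    capability.setMemory(1024);  // in MB

    Priority priority = Records.newRecord(Priority.class);
    priority.setPriority(0);

    // null node/rack lists mean the container may be placed anywhere in the cluster.
    return new ContainerRequest(capability, null, null, priority);
  }
}
```

An AM would hand such a request to AMRMClient.addContainerRequest() before its next allocate() call, at which point the scheduler decides where and when to grant it.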
- TokenSecretManagers (for security): ResourceManager has a collection of SecretManagers which are charged with managing the tokens and secret-keys that are used to authenticate/authorize requests on various RPC interfaces. A future post on YARN security will cover a more detailed description of the tokens, secret-keys and the secret-managers, but a brief summary follows:
- ApplicationTokenSecretManager: To prevent arbitrary processes from sending RM scheduling requests, the RM uses per-application tokens called ApplicationTokens. This component saves each token locally in memory till the application finishes and uses it to authenticate any request coming from a valid AM process.
- ContainerTokenSecretManager: SecretManager for ContainerTokens that are special tokens issued by RM to an AM for a container on a specific node. ContainerTokens are used by AMs to create a connection to the corresponding NM where the container is allocated. This component is RM-specific, keeps track of the underlying master and secret-keys and rolls the keys every so often.
- RMDelegationTokenSecretManager: A ResourceManager-specific delegation-token secret-manager. It is responsible for generating delegation tokens to clients which can be passed on to unauthenticated processes that wish to be able to talk to the RM. A small client-side sketch of this flow follows this group.
- DelegationTokenRenewer: In secure mode, RM is Kerberos authenticated and so provides the service of renewing file-system tokens on behalf of the applications. This component renews tokens of submitted applications as long as the application runs and till the tokens can no longer be renewed.
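As a brief illustration of the delegation-token flow handled by RMDelegationTokenSecretManager, the sketch below asks the RM for a delegation token via YarnClient. It assumes a secure (Kerberos-enabled) cluster, and the renewer name is a placeholder.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.yarn.api.records.Token;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class RmDelegationTokenSketch {
  public static void main(String[] args) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new Configuration());
    yarnClient.start();

    // Ask the RM for a delegation token that can be handed to a process
    // which cannot authenticate with Kerberos itself ("oozie" is a placeholder renewer).
    Token rmToken = yarnClient.getRMDelegationToken(new Text("oozie"));
    System.out.println("Token kind: " + rmToken.getKind());

    yarnClient.stop();
  }
}
```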
Conclusion
In YARN, the ResourceManager is primarily limited to scheduling, i.e. only arbitrating available resources in the system among the competing applications, and not concerning itself with per-application state management. Because of this clear separation of responsibilities, coupled with the modularity described above and the powerful scheduler API discussed in the previous post, the RM is able to address the most important design requirements – scalability and support for alternate programming paradigms.
To allow for different policy constraints, the scheduler described above in the RM is pluggable and allows for different algorithms. In a future post of this series, we will dig deeper into various features of CapacityScheduler that schedules containers based on capacity guarantees and queues.
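For reference, the scheduler implementation is selected through a ResourceManager configuration property. The property and class names below are given to the best of my knowledge and would normally be set in yarn-site.xml rather than programmatically.

```java
import org.apache.hadoop.conf.Configuration;

public class SchedulerSelectionSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // Pluggable scheduler: swap the class name to change the scheduling policy,
    // e.g. to the FairScheduler or FifoScheduler implementations.
    conf.set("yarn.resourcemanager.scheduler.class",
        "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler");
  }
}
```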
The next post will dive into details of the NodeManager, the component responsible for managing the containers’ life cycle and much more.