As its name suggests, this application runs a shell command on distributed nodes (containers). It is easy to use, so let's go ahead. This post covers:
1. Run the 'ls' command in containers
2. Which path does that command run in?
3. How to run meaningful, node-dependent commands

1. Run the 'ls' command in containers
hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar -shell_command ls -num_containers 1 -container_memory 300 -master_memory 400
The 'ls' command will run in the allocated container(s), and the result will look like this:
more userlogs/application_1433385109839_0001/container_1433385109839_0001_01_000002/stdout
container_tokens
default_container_executor.sh
launch_container.sh
tmp
Why does the stdout file contain these entries? You can look into the NodeManager log:
2015-06-04 15:55:10,424 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://localhost:9000/user/userxx/DistributedShell/application_1433403689317_0001/AppMaster.jar(->/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0001/filecache/10/AppMaster.jar) transitioned from DOWNLOADING to LOCALIZED
2015-06-04 15:55:10,502 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://localhost:9000/user/userxx/DistributedShell/application_1433403689317_0001/shellCommands(->/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0001/filecache/11/shellCommands) transitioned from DOWNLOADING to LOCALIZED
2015-06-04 15:55:10,644 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [nice, -n, 0, bash, /usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0001/container_1433403689317_0001_01_000001/default_container_executor.sh]
You will see a file named 'default_container_executor.sh' placed in the working directory (which is named after the current container), so the result of the 'ls' command is correct.
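By the way, if you are not sure where those stdout/stderr files live, the layout can be sketched like this (a rough sketch assuming the default directories shown above; the exact paths depend on yarn.nodemanager.log-dirs and yarn.nodemanager.local-dirs on your cluster):

APP_ID=application_1433385109839_0001
# per-container stdout/stderr live under the NodeManager log directory
ls logs/userlogs/$APP_ID/                      # one sub-directory per container
cat logs/userlogs/$APP_ID/container_*/stdout   # output of the shell command
# the working directories themselves (container_tokens, launch scripts, ...) live
# under the appcache path seen in the NodeManager log and are removed once the app finishes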
2. Which path does that command run in?
Yes, the result is absolutely right, but how do we verify that the current working directory really is the container directory (e.g. 'container_1433385109839_0001_01_000002' above)?
Of course, that is simple too: use 'pwd' instead of 'ls' as the -shell_command parameter.
hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar -shell_command pwd -num_containers 1 -container_memory 300 -master_memory 400
Now check out the stdout file; the result will look like this:
/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0002/container_1433403689317_0002_01_000002
This time the directory differs slightly from the one in point 1, since this is the second application ;)
3. How to run meaningful, node-dependent commands
If you want to run a *custom script* (with its own parameters) whose behavior is *node-specific* (i.e. different results on different nodes), you can pass a script file instead (a node-aware variant is sketched a bit further below):
hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar -shell_script ls-command.sh -num_containers 1 -container_memory 300 -master_memory 400
The file 'ls-command.sh' is simple:
ls -al /tmp/
Note that this file must be executable, so run this before submitting the job:
chmod +x ls-command.sh
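To make the behavior genuinely node-dependent, the script can branch on the host it lands on. Below is a minimal sketch (the file name 'node-report.sh' and the branch conditions are illustrative assumptions, not part of the DistributedShell example); it is submitted with -shell_script just like 'ls-command.sh', and of course also needs chmod +x first:

#!/bin/bash
# node-report.sh: produce different output depending on the node the container runs on
HOST=$(hostname)
echo "running in $(pwd) on $HOST"
case "$HOST" in
  master*) ls -al /usr/local/hadoop/logs ;;   # example action for a node whose hostname starts with 'master'
  *)       ls -al /tmp/ ;;                    # default action for any other node
esac

Each container's stdout then shows the per-node result.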
Appendix:
A. In the NodeManager log, we find this info:
2015-06-04 15:55:17,223 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1433403689317_0001 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2015-06-04 15:55:17,223 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0001
So if you check the appcache directory afterwards, nothing is left there:
ll /usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/
total 0
B. The AM is responsible for setting up the containers; the NM then actually launches them.
more userlogs/application_1433385109839_0001/container_1433385109839_0001_01_000001/AppMaster.stderr
15/06/04 12:26:09 INFO distributedshell.ApplicationMaster: Initializing ApplicationMaster
15/06/04 12:26:09 INFO distributedshell.ApplicationMaster: Application master for app, appId=1, clustertimestamp=1433385109839, attemptId=1
2015-06-04 12:26:09.755 java[1261:1903] Unable to load realm info from SCDynamicStore
15/06/04 12:26:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/04 12:26:10 INFO impl.TimelineClientImpl: Timeline service is not enabled
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Starting ApplicationMaster
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Executing with tokens:
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@7950d786)
15/06/04 12:26:10 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8030
15/06/04 12:26:10 INFO impl.NMClientAsyncImpl: Upper bound of the thread pool size is 500
15/06/04 12:26:10 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies : 500
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Max mem capabililty of resources in this cluster 8192
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Max vcores capabililty of resources in this cluster 32
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Received 0 previous AM's running containers on AM registration.
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[<memory:300, vCores:1>]Priority[0]
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[<memory:300, vCores:1>]Priority[0]
15/06/04 12:26:12 INFO impl.AMRMClientImpl: Received new token for : localhost:52226
15/06/04 12:26:12 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, allocatedCnt=1
15/06/04 12:26:12 INFO distributedshell.ApplicationMaster: Launching shell command on a new container., containerId=container_1433385109839_0001_01_000002, containerNode=localhost:52226, containerNodeURI=localhost:8042, containerResourceMemory1024, containerResourceVirtualCores1
15/06/04 12:26:12 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1433385109839_0001_01_000002
15/06/04 12:26:12 INFO impl.NMClientAsyncImpl: Processing Event EventType: START_CONTAINER for Container container_1433385109839_0001_01_000002
15/06/04 12:26:12 INFO impl.ContainerManagementProtocolProxy: Opening proxy : localhost:52226
15/06/04 12:26:12 INFO impl.NMClientAsyncImpl: Processing Event EventType: QUERY_CONTAINER for Container container_1433385109839_0001_01_000002
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1433385109839_0001_01_000002, state=COMPLETE, exitStatus=0, diagnostics=
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1433385109839_0001_01_000002
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, allocatedCnt=1
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Launching shell command on a new container., containerId=container_1433385109839_0001_01_000003, containerNode=localhost:52226, containerNodeURI=localhost:8042, containerResourceMemory1024, containerResourceVirtualCores1
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1433385109839_0001_01_000003
15/06/04 12:26:13 INFO impl.NMClientAsyncImpl: Processing Event EventType: START_CONTAINER for Container container_1433385109839_0001_01_000003
15/06/04 12:26:13 INFO impl.NMClientAsyncImpl: Processing Event EventType: QUERY_CONTAINER for Container container_1433385109839_0001_01_000003
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1433385109839_0001_01_000003, state=COMPLETE, exitStatus=0, diagnostics=
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1433385109839_0001_01_000003
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Application completed. Stopping running containers
15/06/04 12:26:14 INFO impl.ContainerManagementProtocolProxy: Closing proxy : localhost:52226
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Application completed. Signalling finish to RM
15/06/04 12:26:14 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
15/06/04 12:26:15 INFO distributedshell.ApplicationMaster: Application Master completed successfully. exiting
And the AM always starts first, in the first container, before the others.
C. Question: my MacBook Pro has 8 GB of RAM and a dual-core i5 (2.4 GHz), yet the log above reports 32 vcores:
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Max mem capabililty of resources in this cluster 8192
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Max vcores capabililty of resources in this cluster 32
Does anyone know why? I will dig into it tomorrow...
After I ran a new job on a bigger cluster (32 GB memory, 8 CPUs), these values stayed the same, so I suspected they are configuration values set in code or XML.
Today I dug into 'CapacityScheduler#getMaximumAllocation()':
public Resource getMaximumAllocation() {
  int maximumMemory = getInt(
      YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB,
      YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_MB);
  int maximumCores = getInt(
      YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES,
      YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES);
  return Resources.createResource(maximumMemory, maximumCores);
}

public Resource getMinimumAllocation() {
  int minimumMemory = getInt(
      YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
      YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);
  int minimumCores = getInt(
      YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES,
      YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES);
  return Resources.createResource(minimumMemory, minimumCores);
}
| case | property | default in code | default in xml | description |
| --- | --- | --- | --- | --- |
| max | yarn.scheduler.maximum-allocation-mb | 8g | 8g | Max RAM per container. The maximum allocation for every container request at the RM, in MBs. Memory requests higher than this won't take effect and will get capped to this value. |
| max | yarn.scheduler.maximum-allocation-vcores | 4 cores | 32 cores | Max vcores per container. The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect and will get capped to this value. |
| min | yarn.scheduler.minimum-allocation-mb | 1g | 1g | |
| min | yarn.scheduler.minimum-allocation-vcores | 1 core | 1 core | |
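So the 32 vcores simply reflects the XML default of yarn.scheduler.maximum-allocation-vcores. To use different limits, these properties can be overridden in yarn-site.xml on the ResourceManager; a minimal sketch follows, where the values are arbitrary examples rather than recommendations:

<!-- yarn-site.xml on the ResourceManager: example values only -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>   <!-- cap each container request at 4 GB -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>2</value>      <!-- cap each container request at 2 vcores -->
</property>

Typically the ResourceManager must be restarted for the new limits to take effect.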
Of course, some questions remain:
1. If a node is configured with 4g (so maximum-allocation-mb should be less than or equal to 4g), but my task needs 5g to run, what happens? I think this node would never run that task, so a fix is necessary, e.g.:
// A resource ask cannot exceed the max.
if (amMemory > maxMem) {
  LOG.info("AM memory specified above max threshold of cluster. Using max value."
      + ", specified=" + amMemory + ", max=" + maxMem);
  amMemory = maxMem;
}
D. The container id does not strictly follow the app attempt id, but the app id.
container id: container_1433385109839_0001_01_000003
app attempt id: application_1433385109839_0001_00001
app id: application_1433385109839_0001
Since one app may contain multiple attempts, the container must bind to the app id instead of the attempt id for its umbilical relationship.
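Purely as an illustration of the naming relation above (a throw-away shell sketch, not an official API), the app id can be recovered from a container id by keeping only the cluster timestamp and the app sequence number:

CONTAINER_ID=container_1433385109839_0001_01_000003
# field 2 = cluster timestamp, field 3 = app sequence number
APP_ID="application_$(echo "$CONTAINER_ID" | cut -d_ -f2-3)"
echo "$APP_ID"   # -> application_1433385109839_0001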
ref:
http://dongxicheng.org/mapreduce-nextgen/how-to-run-distributedshell/