Below is the source of the `hadoop` command found under the bin directory of a Hadoop release. The hadoop command supports a great many subcommands and options that I can never keep straight, so by reading this code closely I hope to commit at least some of them to memory. My notes are inline as comments.
#!/usr/bin/env bash

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This script runs the hadoop core commands.

# The purpose of these three lines is to find the directory the hadoop
# script itself lives in.
bin=`which $0`
bin=`dirname ${bin}`
bin=`cd "$bin"; pwd`

# Locate hadoop-config.sh, which defines much of the configuration used by
# the Hadoop commands. Check HADOOP_LIBEXEC_DIR first; if it is not set,
# fall back to the default path, i.e. libexec under the Hadoop root directory.
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

function print_usage(){
  echo "Usage: hadoop [--config confdir] COMMAND"
  echo "       where COMMAND is one of:"
  echo "  fs                   run a generic filesystem user client"
  echo "  version              print the version"
  echo "  jar <jar>            run a jar file"
  echo "  checknative [-a|-h]  check native hadoop and compression libraries availability"
  echo "  distcp <srcurl> <desturl> copy file or directories recursively"
  echo "  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive"
  echo "  classpath            prints the class path needed to get the"
  echo "                       Hadoop jar and the required libraries"
  echo "  daemonlog            get/set the log level for each daemon"
  echo " or"
  echo "  CLASSNAME            run the class named CLASSNAME"
  echo ""
  echo "Most commands print help when invoked w/o parameters."
}

# If no arguments were given, print the usage message and exit.
if [ $# = 0 ]; then
  print_usage
  exit
fi

# Parse the first argument; argument 0 is the command itself.
COMMAND=$1
case $COMMAND in
  # usage flags
  --help|-help|-h)
    print_usage
    exit
    ;;

  #hdfs commands
  namenode|secondarynamenode|datanode|dfs|dfsadmin|fsck|balancer|fetchdt|oiv|dfsgroups)
    echo "DEPRECATED: Use of this script to execute hdfs command is deprecated." 1>&2
    echo "Instead use the hdfs command for it." 1>&2
    echo "" 1>&2
    #try to locate hdfs and if present, delegate to it.
    shift
    if [ -f "${HADOOP_HDFS_HOME}"/bin/hdfs ]; then
      exec "${HADOOP_HDFS_HOME}"/bin/hdfs ${COMMAND/dfsgroups/groups} "$@"
    elif [ -f "${HADOOP_PREFIX}"/bin/hdfs ]; then
      exec "${HADOOP_PREFIX}"/bin/hdfs ${COMMAND/dfsgroups/groups} "$@"
    else
      echo "HADOOP_HDFS_HOME not found!"
      exit 1
    fi
    ;;

  #mapred commands for backwards compatibility
  pipes|job|queue|mrgroups|mradmin|jobtracker|tasktracker|mrhaadmin|mrzkfc|jobtrackerha)
    echo "DEPRECATED: Use of this script to execute mapred command is deprecated." 1>&2
    echo "Instead use the mapred command for it." 1>&2
    echo "" 1>&2
    #try to locate mapred and if present, delegate to it.
    shift
    if [ -f "${HADOOP_MAPRED_HOME}"/bin/mapred ]; then
      exec "${HADOOP_MAPRED_HOME}"/bin/mapred ${COMMAND/mrgroups/groups} "$@"
    elif [ -f "${HADOOP_PREFIX}"/bin/mapred ]; then
      exec "${HADOOP_PREFIX}"/bin/mapred ${COMMAND/mrgroups/groups} "$@"
    else
      echo "HADOOP_MAPRED_HOME not found!"
      exit 1
    fi
    ;;

  # Print the classpath Hadoop runs with; handy for tracking down
  # classpath errors.
  classpath)
    if $cygwin; then
      CLASSPATH=`cygpath -p -w "$CLASSPATH"`
    fi
    echo $CLASSPATH
    exit
    ;;

  #core commands
  *)
    # the core commands
    if [ "$COMMAND" = "fs" ] ; then
      CLASS=org.apache.hadoop.fs.FsShell
    elif [ "$COMMAND" = "version" ] ; then
      CLASS=org.apache.hadoop.util.VersionInfo
    elif [ "$COMMAND" = "jar" ] ; then
      CLASS=org.apache.hadoop.util.RunJar
    elif [ "$COMMAND" = "checknative" ] ; then
      CLASS=org.apache.hadoop.util.NativeLibraryChecker
    elif [ "$COMMAND" = "distcp" ] ; then
      CLASS=org.apache.hadoop.tools.DistCp
      CLASSPATH=${CLASSPATH}:${TOOL_PATH}
    elif [ "$COMMAND" = "daemonlog" ] ; then
      CLASS=org.apache.hadoop.log.LogLevel
    elif [ "$COMMAND" = "archive" ] ; then
      CLASS=org.apache.hadoop.tools.HadoopArchives
      CLASSPATH=${CLASSPATH}:${TOOL_PATH}
    elif [[ "$COMMAND" = -* ]] ; then
      # class and package names cannot begin with a -
      echo "Error: No command named \`$COMMAND' was found. Perhaps you meant \`hadoop ${COMMAND#-}'"
      exit 1
    else
      # If none of the above matched, the first argument is interpreted as a
      # class name, for example:
      #   hadoop org.apache.hadoop.examples.WordCount /tmp/15 /tmp/46
      CLASS=$COMMAND
    fi

    # Remove the first argument from $@. For
    # "hadoop org.apache.hadoop.examples.WordCount /tmp/15 /tmp/46":
    # before shift: $@ = org.apache.hadoop.examples.WordCount /tmp/15 /tmp/46
    # after shift:  $@ = /tmp/15 /tmp/46
    shift

    # Always respect HADOOP_OPTS and HADOOP_CLIENT_OPTS
    # The defaults for these two variables are defined in hadoop-config.sh;
    # to change the launch options (for example, to enable remote debugging),
    # you can also edit that file.
    HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"

    #make sure security appender is turned off
    HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,NullAppender}"

    # Compatibility with the cygwin emulation layer.
    if $cygwin; then
      CLASSPATH=`cygpath -p -w "$CLASSPATH"`
    fi

    # Nothing special by itself; exporting here is a convenient hook for
    # modifying or extending the CLASSPATH.
    export CLASSPATH=$CLASSPATH
    exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
    ;;
esac
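The three `bin=` lines at the top are a common idiom for a script to resolve its own installation directory, so it works no matter where it is invoked from. A minimal standalone sketch (the file name where-am-i.sh is just for illustration):

#!/usr/bin/env bash
# Same idiom as bin/hadoop: `which` resolves the script path, `dirname`
# strips the file name, and `cd; pwd` canonicalizes a relative path
# into an absolute one.
bin=`which $0`
bin=`dirname ${bin}`
bin=`cd "$bin"; pwd`
echo "this script lives in: $bin"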
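The line HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR} uses bash's default-value expansion, which is also how HADOOP_SECURITY_LOGGER gets its INFO,NullAppender fallback further down. A quick sketch (paths are made up for illustration):

#!/usr/bin/env bash
# ${VAR:-default} expands to $VAR if it is set and non-empty,
# otherwise to the default; the variable itself is not modified.
unset LIBEXEC_DIR
echo "${LIBEXEC_DIR:-/opt/hadoop/libexec}"   # prints: /opt/hadoop/libexec
LIBEXEC_DIR=/usr/local/libexec
echo "${LIBEXEC_DIR:-/opt/hadoop/libexec}"   # prints: /usr/local/libexec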
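In the hdfs and mapred delegation branches, ${COMMAND/dfsgroups/groups} and ${COMMAND/mrgroups/groups} use bash pattern substitution so that the legacy spelling is rewritten before being handed to the new command. A sketch of the expansion by itself:

#!/usr/bin/env bash
# ${var/pattern/replacement} substitutes the first match of pattern.
# This is how "hadoop dfsgroups" is delegated as "hdfs groups".
COMMAND=dfsgroups
echo "${COMMAND/dfsgroups/groups}"   # prints: groups
COMMAND=fsck
echo "${COMMAND/dfsgroups/groups}"   # no match, prints: fsck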
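Reading the core-commands branch, each subcommand is just a short name for a Java main class, which is one way to remember the parameters. Some invocations and the class each one resolves to per the case branch above (my-job.jar is a hypothetical jar name):

hadoop version            # runs org.apache.hadoop.util.VersionInfo
hadoop fs -ls /           # runs org.apache.hadoop.fs.FsShell -ls /
hadoop jar my-job.jar     # runs org.apache.hadoop.util.RunJar my-job.jar
hadoop classpath          # just echoes $CLASSPATH and exits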
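The `shift` before the final exec is what separates the class name from the arguments that will be passed to it. A standalone sketch you can run to watch the positional parameters change (the file name shift-demo.sh is just for illustration):

#!/usr/bin/env bash
# Run as: ./shift-demo.sh org.apache.hadoop.examples.WordCount /tmp/15 /tmp/46
CLASS=$1                  # first positional parameter, the class name
echo "before shift: $*"
shift                     # drops $1; $2 becomes $1, and so on
echo "after shift:  $*"   # only the arguments destined for $CLASS remain
echo "class: $CLASS"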
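About the remote-debugging note: since the launcher appends HADOOP_OPTS and HADOOP_CLIENT_OPTS to the java command line, you can attach a debugger to the client JVM without editing hadoop-config.sh at all. A hedged example using the standard JDWP agent (port 8000 is an arbitrary choice):

# suspend=y makes the JVM wait until a debugger connects on port 8000.
export HADOOP_CLIENT_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"
hadoop org.apache.hadoop.examples.WordCount /tmp/15 /tmp/46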
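Finally, note that the script starts Java with `exec` rather than a plain invocation: exec replaces the shell process with the JVM, so no extra bash process is left hanging around. A tiny sketch of that behavior:

#!/usr/bin/env bash
# After a successful exec, nothing in this script ever runs again.
exec echo "this echo has replaced the shell"
echo "never reached"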