Install and configure Hadoop 1.1.1 on OS X

 

As homework for the Hadoop workshop, I'm keeping this note here.

 

Hadoop install steps:

$ sudo cp ~/Downloads/hadoop-1.1.1.tar.gz ~/dev/hadoop-1.1.1.tar.gz
$ cd ~/dev
$ sudo tar -xvzf hadoop-1.1.1.tar.gz
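
Because the archive is extracted with sudo, the resulting tree may end up owned by root. Handing it back to your own user (a precaution, not part of the original steps) avoids permission problems later:

$ sudo chown -R $(whoami) ~/dev/hadoop-1.1.1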

 

Env variable setup:

cat - >> ~/.zshrc
export HADOOP_HOME_WARN_SUPPRESS=1
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
export HADOOP_HOME="/Users/gsun/dev/hadoop-1.1.1"
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

 

Press Ctrl-D to exit.
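
Reload the shell configuration and check that the variables took effect; given the exports above you should see something like:

$ source ~/.zshrc
$ echo $HADOOP_HOME
/Users/gsun/dev/hadoop-1.1.1
$ which hadoop
/Users/gsun/dev/hadoop-1.1.1/bin/hadoop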

 

 

Alternatively, you can install Hadoop through Homebrew. Instructions for installing Homebrew (it's really awesome) are here: https://github.com/mxcl/homebrew/wiki/installation

First, view detailed information about the Hadoop formula in the Homebrew repository:

# gsun at MacBookPro in ~/prog/hadoop/hadoop-guide on git:master o [18:26:25]
$ brew info hadoop
hadoop: stable 1.1.2
http://hadoop.apache.org/
Not installed
From: https://github.com/mxcl/homebrew/commits/master/Library/Formula/hadoop.rb
==> Caveats
In Hadoop's config file:
  /usr/local/Cellar/hadoop/1.1.2/libexec/conf/hadoop-env.sh
$JAVA_HOME has been set to be the output of:
  /usr/libexec/java_home

Then install it:

brew install hadoop 
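
Assuming Homebrew's default /usr/local prefix, the formula links the hadoop launcher onto your PATH, so a quick sanity check afterwards looks like this:

$ which hadoop
/usr/local/bin/hadoop
$ hadoop version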

 

 

Hadoop env setup:

As you may already know, we can configure and use Hadoop in three modes:

 

1. Standalone mode

This is the default mode you get when you download and extract Hadoop for the first time. In this mode, Hadoop doesn't use HDFS to store input and output files; it works directly against the local filesystem. This is very useful for debugging your MapReduce code before you deploy it on a large cluster to handle huge amounts of data. In this mode, Hadoop's configuration file triplet (mapred-site.xml, core-site.xml, hdfs-site.xml) stays free of custom configuration.
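
A quick standalone smoke test, assuming the examples jar that ships in the root of the 1.1.1 tarball (hadoop-examples-1.1.1.jar), reads and writes plain local directories:

$ cd $HADOOP_HOME
$ mkdir input
$ cp conf/*.xml input
$ hadoop jar hadoop-examples-1.1.1.jar grep input output 'dfs[a-z.]+'
$ cat output/*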

 

2. Pseudo distributed mode (or single node cluster)

In this mode, we configure the triplet to run on a single node. The HDFS replication factor is one, because the same node acts as NameNode, DataNode, JobTracker, and TaskTracker. We can use this mode to test our code against a real HDFS without the complexity of a fully distributed cluster. I've already covered the configuration process in a previous post.
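
One prerequisite worth adding here: start-all.sh uses SSH to launch the daemons, so this mode needs passwordless SSH to localhost. On OS X, enable Remote Login under System Preferences > Sharing first, then (assuming you have no key pair yet):

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys
$ ssh localhost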

 

3. Fully distributed mode (or multiple node cluster)

In this mode, we use Hadoop at full scale, on a cluster that can consist of thousands of nodes working together. This is the production phase, where your code and data are distributed across many nodes. You use this mode once your code is ready and works properly in the previous modes.
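
Beyond the configuration triplet covered below, this mode also needs conf/masters (the host that runs the SecondaryNameNode) and conf/slaves (the hosts that run DataNodes and TaskTrackers), one hostname per line. A sketch with made-up hostnames:

$ cat conf/masters
master.example.com

$ cat conf/slaves
worker01.example.com
worker02.example.com
worker03.example.com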

 

So how do we switch among the three modes? Here's the trick: we keep a separate Hadoop configuration directory (conf/) for each mode. Let's assume that you've just extracted your Hadoop distribution and haven't made any changes to the configuration triplet. In the terminal, run these commands:

# gsun at MacBookPro in ~ [18:38:57]
$ cd $HADOOP_HOME

# gsun at MacBookPro in ~/dev/hadoop-1.1.1 [18:39:05]
$ cp -R conf conf.standalone

# gsun at MacBookPro in ~/dev/hadoop-1.1.1 [18:39:32]
$ cp -R conf conf.pseudo

# gsun at MacBookPro in ~/dev/hadoop-1.1.1 [18:40:00]
$ cp -R conf conf.distributed

# gsun at MacBookPro in ~/dev/hadoop-1.1.1 [18:40:22]
$ rm -R conf

 

 

Now if you want to switch to pseudo mode, do this:

# gsun at MacBookPro in ~/dev/hadoop-1.1.1 [18:40:53]
$ ln -s conf.pseudo conf
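
To switch modes later, remove the symlink first (not the directory it points to) and relink, for example back to standalone:

$ rm conf
$ ln -s conf.standalone conf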

 

 

To configure Hadoop in pseudo-distributed mode, you have to edit Hadoop's configuration file triplet: mapred-site.xml, core-site.xml, and hdfs-site.xml.

 

1. mapred-site.xml

 

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>mapred.job.tracker</name>
		<value>localhost:8021</value>
	</property>

	<property>
		<name>mapred.child.env</name>
		<value>JAVA_LIBRARY_PATH=/Users/gsun/dev/hadoop-1.1.1/lib/native</value>
	</property>
	
	<property>
		<name>mapred.map.output.compression.codec</name>
		<value>com.hadoop.compression.lzo.LzoCodec</value>
	</property>
</configuration>
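
A caveat on the last property: com.hadoop.compression.lzo.LzoCodec comes from the separate hadoop-lzo library, not from Hadoop itself. If you don't have hadoop-lzo installed, either drop that property or fall back to a codec that ships with Hadoop, for example:

	<property>
		<name>mapred.map.output.compression.codec</name>
		<value>org.apache.hadoop.io.compress.DefaultCodec</value>
	</property>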

 2. core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://localhost</value>
	</property>

	<property>
		<name>hadoop.tmp.dir</name>
		<value>/Volumes/MacintoshHD/Users/puff/prog/hadoop/hadoop-data</value>
		<description>A base for other temporary directories.</description>
	</property>

  	<property>
  		<name>io.file.buffer.size</name>
  		<value>131072</value>
  	</property>
</configuration>
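
Two notes on this file: an fs.default.name of hdfs://localhost with no explicit port uses the NameNode's default port (8020), and hadoop.tmp.dir must point at an existing directory writable by your user, since the NameNode and DataNode keep their data under it by default. Create whatever directory you configure before formatting HDFS, e.g.:

$ mkdir -p /Volumes/MacintoshHD/Users/puff/prog/hadoop/hadoop-data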

 

 

3. hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>dfs.replication</name>
		<value>1</value>
	</property>
	<property>
    	<name>dfs.permissions</name>
    	<value>false</value>
  	</property>
</configuration>
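
Before starting the daemons for the first time, format the HDFS filesystem with the namenode -format command shown in the usage listing below. This only needs to be done once, and it wipes whatever is under hadoop.tmp.dir:

$ hadoop namenode -format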

 

 

Now try hadoop in your terminal:

# gsun at MacBookPro in ~/dev/hadoop-1.1.1 [18:50:24]
$ hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  mradmin              run a Map-Reduce admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  fetchdt              fetch a delegation token from the NameNode
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  historyserver        run job history servers as a standalone daemon
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.

# gsun at MacBookPro in ~/dev/hadoop-1.1.1 [18:50:26]
$ hadoop -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
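
Note that hadoop -version is handed through to the JVM, which is why it prints the Java version. To print the Hadoop version itself, use the version subcommand without the dash:

$ hadoop version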

 

 

Start all of the Hadoop components:

# gsun at MacBookPro in ~/dev/hadoop-1.1.1 [18:51:59]
$ start-all.sh 
starting namenode, logging to /Users/gsun/dev/hadoop-1.1.1/libexec/../logs/hadoop-gsun-namenode-MacBookPro.local.out
localhost: starting datanode, logging to /Users/gsun/dev/hadoop-1.1.1/libexec/../logs/hadoop-gsun-datanode-MacBookPro.local.out
localhost: 2013-07-02 18:52:05.346 java[2265:1b03] Unable to load realm info from SCDynamicStore
localhost: starting secondarynamenode, logging to /Users/gsun/dev/hadoop-1.1.1/libexec/../logs/hadoop-gsun-secondarynamenode-MacBookPro.local.out
starting jobtracker, logging to /Users/gsun/dev/hadoop-1.1.1/libexec/../logs/hadoop-gsun-jobtracker-MacBookPro.local.out
localhost: starting tasktracker, logging to /Users/gsun/dev/hadoop-1.1.1/libexec/../logs/hadoop-gsun-tasktracker-MacBookPro.local.out
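
To confirm that everything came up, jps (from the JDK) should list the five daemons, and the built-in web UIs should respond on the standard Hadoop 1.x ports:

$ jps
# expect NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker (plus Jps itself)

$ open http://localhost:50070   # NameNode / HDFS web UI
$ open http://localhost:50030   # JobTracker web UI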

 

 

Screenshot of Hadoop JobTracker activity:

 

 
