Hadoop: where to find the bundled word-count example and how to run it

chengjianxiaoxue

Blog category: hadoop2

0 Prepare a file named test with the following contents; the two words on each line are separated by a tab (\t)

[root@hadoop3 ~]# cat test 
hello   you
hello   me
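
One way to create this file without typing the tabs by hand is sketched below. The upload step is an assumption on my part: the post does not show it, but the job later reads /input/test, so the file has to be copied there first. It only runs when the hadoop command is on the PATH.

```shell
# Create the two-line input; \t in the printf format emits a real tab.
printf 'hello\tyou\nhello\tme\n' > test

# Count the lines that contain a tab; both lines should qualify.
tab=$(printf '\t')
grep -c "$tab" test

# Assumed step (not shown in the post): copy the file to the path
# that the wordcount job reads, but only if the hadoop CLI is present.
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -mkdir -p /input
    hadoop fs -put -f test /input/test
fi
```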

1 Locate the following path

In hadoop2.5.2/share/hadoop/mapreduce, find the examples jar, hadoop-mapreduce-examples-2.5.2.jar.
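
If the version in the jar name differs on your machine, a small helper like the one below can locate it. This is only a sketch: find_examples_jar is a hypothetical name, and using HADOOP_HOME as the install directory is an assumption, since the post only gives the relative path hadoop2.5.2/share/hadoop/mapreduce.

```shell
# Print the examples jar(s) directly inside <install dir>/share/hadoop/mapreduce.
# -maxdepth 1 keeps the search to that directory and skips subdirectories.
find_examples_jar() {
    find "$1/share/hadoop/mapreduce" -maxdepth 1 \
        -name 'hadoop-mapreduce-examples-*.jar'
}

# HADOOP_HOME is an assumed variable; run the lookup only if the directory exists.
if [ -d "${HADOOP_HOME:-}/share/hadoop/mapreduce" ]; then
    find_examples_jar "$HADOOP_HOME"
fi
```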

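2 Run the following command. Here wordcount is the program name, /input/test is the input path and /output is the output directory. Both paths are resolved on the cluster's default filesystem (normally HDFS), so the test file must already be uploaded to /input/test, and the job fails if /output already exists.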

[root@hadoop3 mapreduce]# hadoop jar hadoop-mapreduce-examples-2.5.2.jar   wordcount /input/test /output
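
When rerunning the job, the old output directory has to be removed first. The wrapper below is a minimal sketch of that routine, not something from the original post: run_wordcount is a hypothetical helper name, and it deletes the output path without asking, so use it only on throwaway directories.

```shell
# Usage: run_wordcount <examples jar> <input path> <output dir>
run_wordcount() {
    jar=$1
    input=$2
    output=$3

    # Remove a stale output directory so the job does not fail on startup.
    if hadoop fs -test -e "$output"; then
        hadoop fs -rm -r "$output"
    fi

    # Same invocation as in the post, with the paths passed in as arguments.
    hadoop jar "$jar" wordcount "$input" "$output"
}

# Run with the post's values only when the hadoop CLI is available.
if command -v hadoop >/dev/null 2>&1; then
    run_wordcount hadoop-mapreduce-examples-2.5.2.jar /input/test /output
fi
```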

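If you don't know which program names the jar accepts, run it without any arguments:

hadoop jar hadoop-mapreduce-examples-2.5.2.jar and press Enter

It then lists the program names that can be invoked, e.g.: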

[root@hadoop3 mapreduce]# hadoop jar hadoop-mapreduce-examples-2.5.2.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.

The wordcount job produces the following result:

hello	2
me	1
you	1
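
To cross-check these counts, the same input can be counted locally with standard tools. This is only an approximation of what the wordcount program does (split on whitespace, group identical words, count each group), not Hadoop's own code; local_wordcount and wc_input.txt are hypothetical names. The last step, which prints the job's real output, assumes the reducer output files follow the usual part-r-* naming under /output.

```shell
# Local approximation of wordcount for a single small file.
local_wordcount() {
    awk '{ for (i = 1; i <= NF; i++) print $i }' "$1" |  # one word per line
        LC_ALL=C sort |                                  # group identical words
        uniq -c |                                        # count each group
        awk '{ printf "%s\t%s\n", $2, $1 }'              # word<TAB>count
}

printf 'hello\tyou\nhello\tme\n' > wc_input.txt
local_wordcount wc_input.txt
# prints:
# hello	2
# me	1
# you	1

# Print the job's actual output from HDFS, if the hadoop CLI is available.
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -cat '/output/part-r-*'
fi
```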

2015-04-22 17:35