The Complete Nutch Configuration Process

muzhimin


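Install the required software and set the environment variables.
    Supporting software 1: Cygwin. For installation, see my previous post, "Cygwin Basics". I installed it in e:\cygwin. After installation there is a shortcut on the desktop.
    Supporting software 2: the JDK is installed in C:\Program Files\Java\jdk1.5.0, so the environment variable is set to JAVA_HOME=C:\Program Files\Java\jdk1.5.0
    Supporting software 3: Tomcat is installed in e:\tomcat 6.0
    Nutch itself needs no installer; it is a standalone application. The download is nutch-1.0.tar.gz. Double-click the Cygwin shortcut on the desktop and run the following commands: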
$ cd D:/Downloads/Soft
$ tar zxvf nutch-1.0.tar.gz
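If a nutch-1.0 folder appears under D:/Downloads/Soft, the archive was extracted successfully. Next, set the environment variable NUTCH_JAVA_HOME=C:\Program Files\Java\jdk1.5.0 (that is, the same value as JAVA_HOME). To test whether Nutch is set up correctly, just run the following commands:
$ cd D:/Downloads/Soft/nutch-1.0/bin
$ sh nutch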
If you see output like the following, the setup succeeded.
Usage: nutch COMMAND
where COMMAND is one of:
crawl             one-step crawler for intranets
admin             database administration, including creation
inject            inject new urls into the database
generate          generate new segments to fetch
fetchlist         print the fetchlist of a segment
fetch             fetch a segment's pages
dump              dump a segment's pages
index             run the indexer on a segment's fetcher output
merge             merge several segment indexes
dedup             remove duplicates from a set of segment indexes
updatedb          update database from a segment's fetcher output
mergesegs         merge multiple segments into a single segment
readdb            examine arbitrary fields of the database
analyze           adjust database link-analysis scoring
server            run a search server
or
CLASSNAME         run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
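
The checks above can also be scripted from inside the Cygwin shell. What follows is only a rough sketch, not part of the original setup: it assumes Cygwin's default /cygdrive mount prefix, reuses the paths from this post, and introduces a helper called check_nutch and a variable called NUTCH_DIR, both made up for illustration. It only verifies that the JDK directory and the bin/nutch launcher exist before running the launcher.

```shell
#!/bin/sh
# Sketch only: the paths are the ones used in this post, adjust them to your machine.
# Assumption: Cygwin maps drive X: to /cygdrive/x (its default mount prefix).
JAVA_HOME="${JAVA_HOME:-/cygdrive/c/Program Files/Java/jdk1.5.0}"
NUTCH_JAVA_HOME="${NUTCH_JAVA_HOME:-$JAVA_HOME}"   # same value as JAVA_HOME
export JAVA_HOME NUTCH_JAVA_HOME

# NUTCH_DIR is a hypothetical variable name, not something Nutch itself reads.
NUTCH_DIR="${NUTCH_DIR:-/cygdrive/d/Downloads/Soft/nutch-1.0}"

# check_nutch DIR (hypothetical helper)
# Succeeds only if NUTCH_JAVA_HOME is a directory and DIR/bin/nutch exists.
check_nutch() {
    if [ ! -d "$NUTCH_JAVA_HOME" ]; then
        echo "NUTCH_JAVA_HOME is not a directory: $NUTCH_JAVA_HOME" >&2
        return 1
    fi
    if [ ! -f "$1/bin/nutch" ]; then
        echo "bin/nutch not found under: $1" >&2
        return 1
    fi
    echo "ok: $1/bin/nutch"
}

# Run the launcher only when both checks pass; it should print its usage text.
if check_nutch "$NUTCH_DIR"; then
    sh "$NUTCH_DIR/bin/nutch"
fi
```

Setting the variables with export this way only lasts for the current Cygwin session; the Windows environment variable dialog described earlier makes them permanent.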
