Qieqie
Paoding 2.0.2 Notes


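The Paoding code currently in SVN supports automatic, dynamic dictionary loading and detects when dictionary files are updated or deleted.
Automatic detection can also be turned off (paoding.stopAutoDetecting); a method, paoding.forceDetecting, is provided to run a single detection pass manually.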

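This version is 2.0.2, but there are no plans to package it as a jar or zip for now.
A package will be released once 2.0.3 adds Simplified/Traditional Chinese support and GBK->UTF-8 and Big5->UTF-8 conversion.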

-------------------------------
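2007-9-19:
Plan change: Simplified/Traditional Chinese support is dropped from 2.0 and postponed to 2.1; version number 2.0.3 is left unused. The next released version is 2.0.4-alpha.
Correction of a mistaken view: Lucene takes a Reader as input, and by that point there is no longer any encoding issue; everything is already Unicode characters. Whether a file is stored as GBK or Big5, once it has been converted into a Reader the notion of encoding no longer applies. So there is no GBK->UTF-8 conversion for Paoding to perform.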
-------------------------------
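
The point about Reader can be illustrated with a small, self-contained sketch. It does not use Paoding or Lucene; the class name ReaderEncodingDemo and the helper readAll are made up for this illustration. The sample text is the phrase 庖丁解牛, written with Unicode escapes so the source compiles regardless of file encoding. The sketch assumes the JVM ships the GBK charset, which standard JDK builds do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

public class ReaderEncodingDemo {

    // Decode bytes through a Reader with the given charset, the way a caller
    // would before handing the Reader to an analyzer. (Hypothetical helper.)
    public static String readAll(byte[] bytes, String charsetName) {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u5E96\u4E01\u89E3\u725B"; // the four characters of the sample phrase
        // The same text stored in two different byte encodings.
        byte[] gbk = text.getBytes(Charset.forName("GBK"));
        byte[] utf8 = text.getBytes(Charset.forName("UTF-8"));
        System.out.println("byte lengths differ: " + (gbk.length != utf8.length));
        // Once decoded by a Reader with the matching charset, both yield the same chars.
        System.out.println("decoded text equal: "
                + readAll(gbk, "GBK").equals(readAll(utf8, "UTF-8")));
    }
}
```

The only place the encoding matters is where the Reader is constructed; whatever consumes the Reader sees plain Java chars.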

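After 2.0.3, no new features or functionality will be added without a special reason.
What follows is full testing, with successive releases 2.0.4-alpha --> 2.0.4-beta -->
and finally 2.0.5 once it has been **proven** stable.

After that, no new version will be released unless there is a bug that seriously hinders use.

Versions after 2.0.5 will jump straight to 2.1.0 (the version number will only be bumped if new features need to be added).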
-------------------------------
2007-9-19:
Plan adjustment: development of Simplified/Traditional Chinese support will start with 2.1.
-------------------------------



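An example that detects dictionary changes manually:
	import java.io.StringReader;

	import org.apache.lucene.analysis.Token;
	import org.apache.lucene.analysis.TokenStream;

	import net.paoding.analysis.analyzer.PaodingAnalyzer;
	import net.paoding.analysis.knife.Paoding;
	import net.paoding.analysis.knife.PaodingMaker;

	public class ManualDetectingDemo {
		public static void main(String[] args) throws Exception {
			Paoding paoding = PaodingMaker.make();
			paoding.stopAutoDetecting(); // turn off automatic dictionary detection; detect manually instead
			PaodingAnalyzer analyzer = PaodingAnalyzer.defaultMode(paoding);
			int count = 1;
			while (true) {
				paoding.forceDetecting(); // force one detection pass before each segmentation
				TokenStream ts = analyzer.tokenStream(
						"", new StringReader("庖丁解牛词典检测"));
				Token token;
				// print every token produced for the sample text
				while ((token = ts.next()) != null) {
					System.out.println(token);
				}
				System.out.println("--" + (count++) + "--");
				Thread.sleep(1000 * 5); // wait 5 seconds, leaving time to edit the dictionaries
			}
		}
	}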


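To use automatic detection, make sure other threads are still running; otherwise automatic detection cannot take place.
(Once no other threads remain, Paoding stops detecting on its own, so in general automatic detection can only be tested inside a web application.)
When a dictionary change is detected, a message appears in the log/console.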
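
One plausible reading of the note above is that the detector runs on a daemon thread, which the JVM does not wait for once every non-daemon thread has finished. That is an assumption, not something confirmed by the Paoding source. The sketch below shows the general pattern only: the class DictionaryWatchSketch and its methods are hypothetical, and it polls file timestamps rather than reproducing Paoding's actual detection logic.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DictionaryWatchSketch {

    // Record the last-modified time of every regular file directly under dir.
    public static Map<String, Long> snapshot(File dir) {
        Map<String, Long> result = new HashMap<String, Long>();
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.isFile()) {
                    result.put(f.getName(), f.lastModified());
                }
            }
        }
        return result;
    }

    // Names that were added, deleted, or whose timestamp changed between two snapshots.
    public static Set<String> changed(Map<String, Long> before, Map<String, Long> after) {
        Set<String> names = new TreeSet<String>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            if (!e.getValue().equals(before.get(e.getKey()))) {
                names.add(e.getKey()); // new file or modified file
            }
        }
        for (String name : before.keySet()) {
            if (!after.containsKey(name)) {
                names.add(name); // deleted file
            }
        }
        return names;
    }

    // Start a polling thread marked as daemon: it stops when the JVM exits,
    // which happens as soon as no non-daemon thread is left running.
    public static Thread startDaemon(final File dir, final long intervalMillis) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                Map<String, Long> last = snapshot(dir);
                try {
                    while (true) {
                        Thread.sleep(intervalMillis);
                        Map<String, Long> now = snapshot(dir);
                        Set<String> diff = changed(last, now);
                        if (!diff.isEmpty()) {
                            System.out.println("dictionary changed: " + diff);
                        }
                        last = now;
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // stop polling when interrupted
                }
            }
        }, "dictionary-watch-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

In a plain main method that returns immediately, such a poller never gets a chance to report anything. A servlet container keeps its own threads alive, which would explain why the behaviour is easier to observe in a web application.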
Comments
#4 liang1022 2008-11-13  
Could you describe Paoding's development environment?
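#3 ylangin 2008-04-11  
A question about segmentation:

An article contains the text “第七十四军” (the 74th Army). Searching for “七十四军” from the client returns no results,
so I tried the segmentation tool bundled with Paoding. For each input it reports the analyzer class, the input length in characters, the number of tokens, and the elapsed time. The results: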
paoding> 第七十四军;
1:      第七/第七十/4/军/
        分词器net.paoding.analysis.analyzer.PaodingAnalyzer
        内容长度 5字符, 分 4个词
        分词耗时 31ms
--------------------------------------------------
paoding> 七十四军;
1:      74/军/
        分词器net.paoding.analysis.analyzer.PaodingAnalyzer
        内容长度 4字符, 分 2个词
        分词耗时 0ms
--------------------------------------------------
paoding>

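The text fed in at indexing time was “第七十四军”, so a search for “七十四军” finds nothing.
Is there a good way to handle this?

Two thoughts:
1. Shouldn't every run of numerals be split out on its own, whether or not something modifies it in front? Here “第” is such a modifier.
2. Segmentation should be “stable”. By that I mean a sentence and a part of that sentence should be segmented the same way as far as that part is concerned. Take “第七十四军” and “七十四军”: “七十四军” yields “74, 军”, so “第七十四军” should yield those two tokens as well.

Thanks.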
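
The “stability” property in point 2 can be expressed as a simple check: every token produced for the fragment should also appear among the tokens of the whole text. The sketch below is not part of Paoding; StabilityCheckSketch and missingTokens are hypothetical names, and the token lists are copied by hand from the console output quoted above (written as Unicode escapes so the source compiles under any file encoding).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StabilityCheckSketch {

    // Tokens produced for the fragment that do not occur among the tokens
    // of the whole text. An empty result means the pair is "stable".
    public static Set<String> missingTokens(List<String> wholeTokens, List<String> partTokens) {
        Set<String> missing = new LinkedHashSet<String>(partTokens);
        missing.removeAll(wholeTokens);
        return missing;
    }

    public static void main(String[] args) {
        // Tokens reported for the five-character input (the whole phrase).
        List<String> whole = Arrays.asList(
                "\u7B2C\u4E03",        // first token
                "\u7B2C\u4E03\u5341",  // second token
                "4",
                "\u519B");
        // Tokens reported for the four-character input (the fragment).
        List<String> part = Arrays.asList("74", "\u519B");

        Set<String> missing = missingTokens(whole, part);
        System.out.println("stable: " + missing.isEmpty());
        System.out.println("tokens only in the fragment: " + missing);
    }
}
```

With the quoted output, the token 74 is the one that the indexed text never produced, which is why the query finds nothing.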
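#2 unkin 2007-10-25  
I have run into a problem.
On an intranet server Paoding runs fine, but on an internet-facing server Paoding's initialization never finishes and eventually ends in an out-of-memory error. The log output is pasted below; any advice would be appreciated.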

Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.PaodingMaker getProperties
INFO: config paoding analysis from: /home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analysis.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analysis-default.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analyzer.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-dic-home.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-dic-names.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-knives.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-knives-user.properties
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:01 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.PaodingMaker getProperties
INFO: config paoding analysis from: /home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analysis.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analysis-default.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analyzer.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-dic-home.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-dic-names.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-knives.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-knives-user.properties
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.PaodingMaker getProperties
INFO: config paoding analysis from: /home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analysis.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analysis-default.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analyzer.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-dic-home.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-dic-names.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-knives.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-knives-user.properties
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:02 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:03 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:04 PM net.paoding.analysis.knife.PaodingMaker getProperties
INFO: config paoding analysis from: /home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analysis.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analysis-default.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-analyzer.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-dic-home.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-dic-names.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-knives.properties;/home/resin-ee-2.1.16/file:/home/httpd/search/WEB-INF/lib/paoding.jar!/paoding-knives-user.properties
Oct 25, 2007 3:34:04 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:06 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:06 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:07 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:07 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:07 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:07 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:09 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:10 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:10 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:14 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:14 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:34:22 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
java.lang.OutOfMemoryError: Java heap space
Oct 25, 2007 3:36:00 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
Oct 25, 2007 3:36:00 PM net.paoding.analysis.knife.FileDictionaries loadAllWordsIfNecessary
INFO: loading dictionaries from /home/httpd/search/WEB-INF/classes/dic
java.lang.OutOfMemoryError: Java heap space
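#1 shguan 2007-09-11  
GBK->UTF-8

The utilities provided by the Apache Commons IO project can help with this.
    /* gbkString is a string read from GBK-encoded data */
    String utf8String = IOUtils.toString(IOUtils.toInputStream(gbkString, "UTF-8"), "UTF-8");
The characters of utf8String have then passed through a UTF-8 byte encoding.

For reference, the relevant code in org.apache.commons.io.IOUtils is: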
    /**
     * Convert the specified string to an input stream, encoded as bytes
     * using the specified character encoding.
     * <p>
     * Character encoding names can be found at
     * <a href="http://www.iana.org/assignments/character-sets">IANA</a>.
     *
     * @param input the string to convert
     * @param encoding the encoding to use, null means platform default
     * @throws IOException if the encoding is invalid
     * @return an input stream
     * @since Commons IO 1.1
     */
    public static InputStream toInputStream(String input, String encoding) throws IOException {
        byte[] bytes = encoding != null ? input.getBytes(encoding) : input.getBytes();
        return new ByteArrayInputStream(bytes);
    }
