Detecting the Encoding of a File or Text Stream

zzg810314

import info.monitorenter.cpdetector.io.ASCIIDetector;
import info.monitorenter.cpdetector.io.CodepageDetectorProxy;
import info.monitorenter.cpdetector.io.JChardetFacade;
import info.monitorenter.cpdetector.io.ParsingDetector;
import info.monitorenter.cpdetector.io.UnicodeDetector;

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.nio.charset.Charset;

/**
 * <p>
 * Detects the character encoding of the given input and returns its name.
 * </p>
 * 
 * @version 1.0
 */
public class Detector {
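	/*------------------------------------------------------------------------
	  detectorProxy is the detector. It delegates the detection work to
	  instances of concrete detector implementations. cpdetector ships with
	  several common implementations, which are registered through add(), such
	  as ParsingDetector, JChardetFacade, ASCIIDetector and UnicodeDetector.
	  The proxy returns the charset reported by whichever registered detector
	  is the first to return a non-null result.
	--------------------------------------------------------------------------*/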
	private static CodepageDetectorProxy detectorProxy;
	static {
		detectorProxy = CodepageDetectorProxy.getInstance();
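		/*-------------------------------------------------------------------------
		  ParsingDetector inspects the encoding of HTML, XML and similar files or
		  character streams. The constructor argument controls whether details of
		  the detection process are printed; false suppresses them.
		---------------------------------------------------------------------------*/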
		detectorProxy.add(new ParsingDetector(false));
		/*--------------------------------------------------------------------------
		  JChardetFacade wraps JChardet, which originates from the Mozilla project,
		  and can determine the encoding of most files, so this detector alone is
		  usually enough for most projects. For extra coverage, add further
		  detectors such as the ASCIIDetector and UnicodeDetector below.
		 ---------------------------------------------------------------------------*/
		detectorProxy.add(JChardetFacade.getInstance());
		// ASCIIDetector detects ASCII encoding
		detectorProxy.add(ASCIIDetector.getInstance());
		// UnicodeDetector detects encodings of the Unicode family
		detectorProxy.add(UnicodeDetector.getInstance());

	}

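	public static synchronized String getEncodingType(String content)
			throws IllegalArgumentException, IOException {
		// Encode once with the platform default charset and pass the byte
		// count: content.length() counts chars, not bytes.
		byte[] bytes = content.getBytes();
		ByteArrayInputStream stream = new ByteArrayInputStream(bytes);
		return Detector.getEncodingType(stream, bytes.length);
	}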

	public static synchronized String getEncodingType(File file)
			throws MalformedURLException, IOException {
		// File.toURL() is deprecated; toURI().toURL() escapes special characters.
		Charset charset = detectorProxy.detectCodepage(file.toURI().toURL());
		if (charset != null) {
			return charset.name();
		} else {
			return "unknown";
		}
	}

	public static synchronized String getEncodingType(InputStream inputStream,
			int length) throws IllegalArgumentException, IOException {
		Charset charset = detectorProxy.detectCodepage(inputStream, length);
		if (charset != null) {
			return charset.name();
		} else {
			return "unknown";
		}
	}
}
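
The comments in the class describe the proxy as returning the result of whichever registered detector first produces a non-null answer. The sketch below illustrates that delegation rule on its own, without the cpdetector library. Every name in it (SimpleDetector, BomDetector, AsciiOnlyDetector, FirstMatchProxy, FirstMatchDemo) is hypothetical and invented for illustration; none of it is cpdetector's actual implementation. The two toy detectors only recognize byte order marks and pure 7-bit input.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a detector; NOT the cpdetector interface. */
interface SimpleDetector {
    /** Returns the detected charset, or null if this detector cannot decide. */
    Charset detect(byte[] data);
}

/** Toy detector: recognizes UTF-8 and UTF-16 byte order marks only. */
class BomDetector implements SimpleDetector {
    public Charset detect(byte[] d) {
        if (d.length >= 3 && (d[0] & 0xFF) == 0xEF
                && (d[1] & 0xFF) == 0xBB && (d[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (d.length >= 2 && (d[0] & 0xFF) == 0xFE && (d[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (d.length >= 2 && (d[0] & 0xFF) == 0xFF && (d[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return null; // no BOM found: let the next detector try
    }
}

/** Toy detector: reports US-ASCII when no byte has its high bit set. */
class AsciiOnlyDetector implements SimpleDetector {
    public Charset detect(byte[] d) {
        for (byte b : d) {
            if ((b & 0x80) != 0) {
                return null;
            }
        }
        return StandardCharsets.US_ASCII;
    }
}

/** Asks each detector in registration order; the first non-null answer wins. */
class FirstMatchProxy {
    private final List<SimpleDetector> detectors = new ArrayList<>();

    void add(SimpleDetector detector) {
        detectors.add(detector);
    }

    Charset detect(byte[] data) {
        for (SimpleDetector detector : detectors) {
            Charset result = detector.detect(data);
            if (result != null) {
                return result;
            }
        }
        return null; // nobody could decide
    }
}

class FirstMatchDemo {
    public static void main(String[] args) {
        FirstMatchProxy proxy = new FirstMatchProxy();
        proxy.add(new BomDetector());
        proxy.add(new AsciiOnlyDetector());

        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(proxy.detect(withBom));
        System.out.println(proxy.detect("plain".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(proxy.detect(new byte[] {(byte) 0xE4, (byte) 0xB8}));
    }
}
```

Because the first non-null answer wins, registration order matters: a narrow, reliable detector placed early short-circuits the broader ones that follow it.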

2010-01-08 10:02