Garbled page content when fetching web pages with HttpClient

GhostWolf


Blog category: Java (general)


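Recently, while crawling some pages, I found that the parsed content came back garbled. The getHTMLByDeCode method shown in this post finally solved the problem.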
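
Garbling of this kind usually appears when the response bytes are decoded with a charset other than the one the page was written in. The sketch below reproduces the symptom with the JDK alone; the class name CharsetDemo and its decode helper are hypothetical and are not part of the original post or of HttpClient.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how decoding with the wrong charset garbles text.
final class CharsetDemo {

    /** Decodes raw response bytes with the given charset. */
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character Chinese word for "Chinese (language)".
        String original = "\u4e2d\u6587";
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the right charset round-trips the text.
        String correct = decode(body, StandardCharsets.UTF_8);

        // Decoding with ISO-8859-1 turns every byte into its own character.
        String garbled = decode(body, StandardCharsets.ISO_8859_1);

        System.out.println("matches original (UTF-8):      " + original.equals(correct));
        System.out.println("matches original (ISO-8859-1): " + original.equals(garbled));
        System.out.println("garbled length: " + garbled.length());
    }
}
```

Because ISO-8859-1 maps every byte to exactly one character, text garbled this way can still be recovered by re-encoding it as ISO-8859-1 and decoding again with the correct charset. Passing the right charset up front is simpler.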

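import org.apache.http.HttpEntity;
import org.apache.http.HttpHost;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.conn.params.ConnRoutePNames;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.HttpProtocolParams;
import org.apache.http.util.EntityUtils;

// ipPortList (a list of "ip:port" proxy strings), logger and StringUtil are
// assumed to be members of the enclosing class.
public static String getHTMLByDeCode(String url, String... params) throws Exception {
    DefaultHttpClient httpClient = new DefaultHttpClient();
    int index = 0;
    if (ipPortList.size() != 0) {
        // Pick a random proxy from the pool.
        index = (int) (Math.random() * ipPortList.size());
        String ipPort = ipPortList.get(index);
        if (!StringUtil.isEmpty(ipPort)) {
            logger.debug(index + ">>>" + ipPort);
            String[] ipPortResult = ipPort.split(":");
            HttpHost proxy = new HttpHost(ipPortResult[0], Integer.parseInt(ipPortResult[1])); // set the proxy IP
            httpClient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);
        }
    }
    HttpProtocolParams.setUserAgent(httpClient.getParams(), "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.1.9) Gecko/20100315 Firefox/3.5.9");
    // Default to UTF-8; the first optional parameter overrides it.
    String charset = "UTF-8";
    if (null != params && params.length >= 1) {
        charset = params[0];
    }
    HttpGet httpget = new HttpGet();
    String content = "";
    try {
        httpget.setURI(new java.net.URI(url));
        HttpResponse response = httpClient.execute(httpget);
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            // Pass a default charset to EntityUtils.toString. It is used only when the
            // response declares no charset; without it, EntityUtils falls back to ISO-8859-1.
            content = EntityUtils.toString(entity, charset);
        }
    } catch (Exception e) {
        e.printStackTrace();
        if (ipPortList.size() == 0) {
            throw e; // no proxy left to drop, so retrying would recurse forever
        }
        if (index < ipPortList.size()) {
            ipPortList.remove(index); // drop the proxy that just failed
        }
        logger.debug("get proxy again!!!!");
        content = getHTMLByDeCode(url, params); // return the retry's result
    } finally {
        // Always release the connection, whether the request succeeded or failed.
        httpClient.getConnectionManager().shutdown();
    }
    return content;
}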
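
The method above leaves it to the caller to know the right charset. One way to find it is to read the charset parameter from the response's Content-Type header value. The following is only a sketch under that assumption: the class ContentTypeCharset and its method fromContentType are hypothetical names, it uses the JDK alone, and it does not inspect meta tags inside the HTML.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the charset from a Content-Type header value.
final class ContentTypeCharset {

    // Matches charset=utf-8, charset="utf-8" or charset='utf-8', ignoring case.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the declared charset, or the fallback if none is declared or it is unknown. */
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        Matcher m = CHARSET.matcher(contentType);
        if (!m.find()) {
            return fallback;
        }
        try {
            return Charset.forName(m.group(1));
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return fallback;
        }
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1;
        System.out.println(fromContentType("text/html; charset=utf-8", fallback));
        System.out.println(fromContentType("text/html", fallback));
    }
}
```

The name of the returned charset could then be passed as the optional second argument of getHTMLByDeCode.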

References:
http://dh189.iteye.com/blog/732111
http://mhqawjh.iteye.com/blog/473450


2010-11-18 10:17